Plate reader files


#1

I work a lot with plate readers for measuring enzyme kinetics.
Every reader has its own output format (often several, depending on settings), and they are generally a pain to parse.

I am contemplating writing a package with parsers for this kind of equipment (kinetic, endpoint, absorbance, fluorescence, …).

The only existing example I have found is getEnVisionRawData() from the cellHTS2 Bioconductor package. The plater package (https://github.com/ropensci/plater) deals with similar data and could be a good starting point.

Would anybody here be interested in collaborating on such a package?


#2

@seaaan :point_up_2: fyi


#3

Thanks @stefanie!

I would be happy to collaborate on something like this if I could be helpful. The approach I took in plater was to create a general-purpose format that people could conveniently copy-and-paste their data into, regardless of the specific format from the particular instrument. This has the advantage that no separate function is needed for every instrument, and that metadata about the wells is stored in the same file. It has the disadvantage that users have to copy and paste from the raw instrument file into a new file.

If I understand your goal correctly, you would like to automatically read in the data in the format provided by the instrument. I could imagine doing this in a couple of ways: (1) functions that convert from the raw instrument output to plater format and write a CSV file that users could add any necessary additional data to or (2) functions that convert from the raw instrument output to a tidy data frame. Another possibility would be to create a framework where users could easily create their own function to handle the raw data from their specific instrument.

Happy to help and discuss further! Let me know how I can be useful to you.


#4

Hi @seaaan,

Thanks a lot for offering to help.

What I have in mind at present is mostly your suggestion 2: parse the raw instrument file to a tidy data frame, and perhaps in time evolve a framework for easily writing these parsers.

I also like your idea of letting users add extra data (eg sample identifiers and location of controls) in plater-format.

Here are my initial thoughts on which columns such a data frame should have for a kinetic absorbance experiment in 384-well format:

  • readerfile (char) the name of the raw file parsed
  • barcode (char) the barcode of the plate (my reader can read barcodes)
  • well384 (char) the well (A01, A02, … P24)
  • absorbance_nm (num) the detected wavelength in nm, eg 405
  • kinetic_step (num) the cycle number (1, 2, … up to number of kinetic steps)
  • kinetic_sec (num) seconds since beginning of experiment
  • OD (num) the measured intensity (for absorbance typically between 0 and 3)
  • chamber_temperature_C (num) the temperature in degrees Celsius.
  • warnings (char) warnings reported for the plate or well

So the first row could look like this in CSV format:
readerfile,barcode,well384,absorbance_nm,kinetic_step,kinetic_sec,OD,chamber_temperature_C,warnings
exp01.xlsx,000XCFR,A01,405,1,3,0.0003,23.2,

I consider this the minimum information needed for analyzing the results (I guess the kinetic_step is redundant, but it is very convenient). Any immediate comments on this?
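For concreteness, here is a minimal sketch of reading a CSV in the proposed format into R, using base `read.csv` with `colClasses` to enforce the column types listed above. The inline CSV text is just the example row from this post:

```r
# Example row in the proposed kinetic-absorbance format (from this thread).
csv_text <- "readerfile,barcode,well384,absorbance_nm,kinetic_step,kinetic_sec,OD,chamber_temperature_C,warnings
exp01.xlsx,000XCFR,A01,405,1,3,0.0003,23.2,"

# Force the types proposed above; base R only, no extra packages needed.
kinetics <- read.csv(
  text = csv_text,
  colClasses = c(
    readerfile            = "character",
    barcode               = "character",
    well384               = "character",
    absorbance_nm         = "numeric",
    kinetic_step          = "numeric",
    kinetic_sec           = "numeric",
    OD                    = "numeric",
    chamber_temperature_C = "numeric",
    warnings              = "character"
  )
)

str(kinetics)
```

(One could equally use readr::read_csv with a `col_types` spec; this is just the dependency-free version.)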

Further columns I’m considering, but less sure about:

  • kinetic_timestamp (date time) wall-clock timestamp of measurement
  • table_version (char) the name (including version) of this format. In the case above it could be “kinetic_absorbance_384_v1”

The table_version could also be an S3 class, but an advantage of a column is that it survives being stored as a CSV file.

Please comment.


#5

That seems fine. Can you post an example raw data file? I would be in favor of including all the columns that you might use. I am also always in favor of storing as much as possible in data frames rather than special objects, since that makes the data more interoperable with other packages, so I would recommend against a special object. But you’re the one who will be using it, so it’s up to you!

It should be straightforward for me (I think!) to write a function that takes a raw file and gives back the data frame you describe. Do you want me to do that? Or are you looking more for advice on how to go about it? Happy to help however is useful.


#6

Hi @seaaan,

Thanks for the kind offer to help on the implementation. For now I’m mostly looking for advice on the columns of the data frame to store the data in. I find naming stuff surprisingly difficult :wink:
I appreciate your advice to include all relevant information. If we get this right, I hope I will not be the only user of this down the road.

Here’s a second version of the format, please comment:

  • table_version (char) the name (including version) of this format. In this case: “kinetic_absorbance_384_v1”
  • readerfile (char) the name of the raw file parsed
  • readerplate_barcode (char) the barcode of the plate (my reader can read barcodes)
  • well384 (char) the well (A01, A02, … P24)
  • absorbance_nm (num) the detected wavelength in nm, eg 405
  • kinetic_step (num) the cycle number (1, 2, … up to number of kinetic steps)
  • kinetic_sec (num) seconds since beginning of experiment
  • kinetic_timestamp (date time) wall-clock timestamp (ISO 8601) of the reading (eg 2018-05-29T15:29:00Z)
  • absorbance_value (num) the measured intensity (for absorbance typically between 0 and 3)
  • chamber_temperature_C (num) the temperature in degrees Celsius.
  • warnings (char) warnings reported for the plate or well

My thinking is that column names should be specific rather than generic, and they should include the unit (if any). This is to make it easier to write functions that use the data frame, and to track the columns in merges.
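One concrete payoff of specific column names is that a parsed data frame can be checked against the spec before any analysis. A small sketch of a hypothetical validator for the v2 format (the function name and the well-name regex are my own assumptions, not part of any existing package):

```r
# Hypothetical validator for the proposed "kinetic_absorbance_384_v1" format.
required_cols <- c(
  "table_version", "readerfile", "readerplate_barcode", "well384",
  "absorbance_nm", "kinetic_step", "kinetic_sec", "kinetic_timestamp",
  "absorbance_value", "chamber_temperature_C", "warnings"
)

validate_kinetic_absorbance_384 <- function(df) {
  # All required columns must be present.
  missing <- setdiff(required_cols, names(df))
  if (length(missing) > 0) {
    stop("Missing columns: ", paste(missing, collapse = ", "))
  }
  # Wells must look like A01 .. P24 (384-well layout).
  bad <- !grepl("^[A-P](0[1-9]|1[0-9]|2[0-4])$", df$well384)
  if (any(bad)) {
    stop("Malformed well names: ", paste(unique(df$well384[bad]), collapse = ", "))
  }
  invisible(df)
}
```

A parser could call this on its result, so every downstream function can rely on the columns being there.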

I’ll be happy to share examples of raw files. It seems I can only upload images here. What would be a good way to share such files?


#7

Hi again @seaaan,

I’ve uploaded an example here: https://github.com/tp2750/platereader/tree/master/inst/ExampleFiles

As you probably know, these types of software have a dozen settings affecting the layout of the output.


#8

It looks to me like it should be relatively feasible for you to extract the information out of the file. In terms of column names, I have just a couple of suggestions:

Maybe:

readerfile -> reader_file (or just file)

readerplate_barcode -> reader_plate_barcode (or just barcode)

well384 -> well_384 (or just well for consistency if you use a 96-well plate one day)

Otherwise it seems good to me. Let me know what else I can do to help!


#9

Thanks a lot for the suggestions. I’ll keep them in mind.

I’ll see if I can get some time to add a first version to the github repo next week.

I really appreciate the interest you have taken in this, and I hope you will follow the development and possibly continue giving advice.

Also, if you come across an existing project in this problem space, I hope you will let me know :grinning:


#10

Good luck! Ping me on here when you have a first version and I’ll take a look.


#11

Will do,
Thanks a lot.


#12

A group of us just started working on a similar package, but with the optimistic goal of including as many instrument/sensor types as we can. I have a function that I am currently using for reading in raw plate reader data, but I haven’t completely generalized it. It is currently set up for a single instrument, and takes the raw .txt export file and converts it to tidy data. I would be happy to contribute what I have so far to the effort, or I would be happy if you wanted to get involved with our effort on ingestr. So far we’ve been focusing on package infrastructure and a generalized ingest function template to make adding new instruments easier.


#13

Hi @jpshanno

Thanks for following up on this. I’ll be very happy to collaborate.
I wanted to get my existing code cleaned up a bit before adding it to a repo, but it dragged out :smile:

There are a lot of good ideas in ingestr that I like, e.g. your solution to the header information.
It makes a lot of sense to use the column names from the input files. In my experience, different vendors tend to use different names for the same thing (like “step”, “cycle”, “iteration”, etc.). Having the ingestion convert these to a standard set of column names makes the data easier to work with later, but possibly it is better to do this in a separate step.
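As a sketch of doing that renaming as a separate step: a lookup table from vendor variants to a standard name, applied after ingestion. The vendor aliases here are just the examples from above, not an actual mapping from any instrument:

```r
# Hypothetical lookup from vendor-specific column names to standard ones.
# "step", "cycle", "iteration" are the vendor variants mentioned above;
# "time_s" is a made-up example of a timing column.
standard_names <- c(
  step      = "kinetic_step",
  cycle     = "kinetic_step",
  iteration = "kinetic_step",
  time_s    = "kinetic_sec"
)

standardize_columns <- function(df) {
  hits <- names(df) %in% names(standard_names)
  names(df)[hits] <- standard_names[names(df)[hits]]
  df
}

raw <- data.frame(cycle = 1:3, time_s = c(0, 30, 60))
names(standardize_columns(raw))  # "kinetic_step" "kinetic_sec"
```

Keeping this separate from ingestion would also fit the ingestr philosophy of giving users the raw data first.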

I’d love to see the function you already have for a plate reader; I did not find it in the current version of ingestr.


#14

You didn’t find it because it’s not there yet. I haven’t included it in ingestr because we’re trying to figure out a new standard for handling the header data, and we have held off on adding new functions to focus on that. Just inserting it into the global environment isn’t R best practice, so we’re trying to come up with an alternative. I can get it uploaded with an example file and post the link here.

Regarding the column naming, we’ve taken the approach that our goal is to import the raw data into a tidy data frame without making any decisions for the user, i.e. just give them the raw sensor data in R rather than in some weird manufacturer’s format. That means we decided not to try to come up with standard column names, because there will be just as much variability between researchers as there is between manufacturers.

And we’d love to have you collaborate on ingestr, especially if you have instruments you already wrote these functions for!


#15

Here’s the link to the function I have right now, and the raw export from the instrument. The function has only been tested on data from a 96-well plate, but I tried to write it as generally as I could, in the hope that it won’t need too much work to accommodate other data.


#16

Cool!

I have a parser for Tecan i-control files that I’ll be happy to contribute.
Would it be OK if I submit it as a pull request to ingestr?


#17

Pull request created here: https://github.com/jpshanno/ingestr/pull/27