Suggested changes to toxEval vignettes
Created by: limnoliver
The vignettes look great and give a good overview of the process/data/package. Below are some suggestions to improve the flow and interpretability for the user -- take them or leave them!
Introduction vignette:
- include the column names and definitions within the help file for ?endPointInfo in case that PDF from EPA moves or goes away?
- define “intended_target_family” in the Grouping Options section
Preparing data:
- What should users do with values below the detection limit (BDL)? This seems like it should be stated in the "preparing data" section.
- An example for classes? Also, I think Laura showed me a way to automatically generate the chemicals tab of the spreadsheet, shouldn't we show that in the "preparing data" vignette to save users the trouble?
- In the exclude section of "preparing data" - why not just use our example of why to exclude a certain endpoint? e.g., that we lacked confidence in the dose-response curve? This seems like a good bit of info to pass along, and is currently vague with "not an appropriate endpoint".
- The "endPoint" in benchmarks seems really confusing - especially since endPoints is used in the context of toxcast. Also, the endpoint is the thing that is measured in toxcast, whereas the example seems more like the type of chemical exposure from which the benchmark was determined. Could you just call this "benchmark"?
Basic workflow:
- I don't understand what's going on in clean_endpoint_info section, specifically what "removal of endPoints that are ATG sources with signal loss and NVS with signal gain" means.
- Seems like the "custom configuration" section could be moved down into the context of visualization.
- plot_tox_boxplot: Describe what the "extra" ggplot steps are doing - this is unclear, especially whether this is part of ggplot or the package. Okay - now I see this has its own section - maybe add this as a subsection the first time it's used in a visualization?
- One common way to order data would be from lowest to highest EAR (total, not grouped by anything). Maybe show this as an additional example, both how to get that data and how to use it to order factors? This seems like it can be achieved in the tables section.
- In tables, I find this statement slightly awkward/confusing at first: "The tables show slightly different results for a single site". What about, "You can also create tables that summarize individual sites, where the number of samples with hits is presented."
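For the ordering suggestion above (lowest to highest total EAR), here is a rough sketch of what such an example could look like. It uses a made-up data frame in place of the real chemical summary; the column names `chnm` and `EAR` and all values are assumptions for illustration, and only base R is used, so the vignette version would likely differ.

```r
# Toy stand-in for a chemical summary table (hypothetical values;
# the column names chnm and EAR are assumed for illustration).
chem_sum <- data.frame(
  chnm = c("Atrazine", "Caffeine", "Atrazine", "DEET", "Caffeine", "DEET"),
  EAR  = c(0.20, 0.01, 0.30, 0.05, 0.02, 0.10),
  stringsAsFactors = FALSE
)

# Total EAR per chemical, not grouped by anything else.
totals <- aggregate(EAR ~ chnm, data = chem_sum, FUN = sum)

# Sort chemicals from lowest to highest total EAR.
totals <- totals[order(totals$EAR), ]

# Use that order as the factor levels so plots and tables follow it.
chem_sum$chnm <- factor(chem_sum$chnm, levels = totals$chnm)

levels(chem_sum$chnm)
```

Once the factor levels are set this way, anything that respects factor order (ggplot axes, sorted tables) should list chemicals from lowest to highest total EAR.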