add step 3 in readme

Commit b9361b02 authored 3 months ago by Fisher, Jason C.
--- a/data-raw/README.md
+++ b/data-raw/README.md
@@ -110,6 +110,24 @@ The `abstract` key is written as character strings and encoded in UTF-8, a chara

 The `annotation` key consists of two parts: `text` and `source`. The `text` part contains the string of the annotation text, while the `source` part denotes the origin of the annotation, which is indicated by its publication key value (`pub_id`).

+### Step 3: Add Content from Published Files (Optional)
+
+This step requires a folder named **archive** in the package's top-level directory. Within this folder, organize the published files of the INLPO into subfolders by year of publication and publication identifier, for example `2005/KnobelOthers2005/ofr20051223.pdf`. Each file name must also be listed under the `files` key of the corresponding publication entry in the publications metadata (Step 2).
+
+The text within a published file (such as a PDF document) is extracted and stored in the package folder `data-raw/corpus`. When the package datasets are created, this text is included in the package corpus, the collection of all published text that the package uses for analysis and other text processing.
+
+The cover image for a publication is extracted and stored in the package folder `vignettes`. The image extraction process is manual. For example, the cover image for Knobel and others (2005) can be extracted using the following R command:
+
+```r
+inlpubs::add_content("KnobelOthers2005", type = "image", destdir = "vignettes")
+```
+
+To extract cover images for all 2005 publications, use:
+
+```r
+inlpubs::add_content(year = 2005, type = "image", destdir = "vignettes")
+```
+
 ## Execute Script

 To execute the R script from a terminal, you can utilize the command `make datasets`. Please refer to the [Makefile](../Makefile) located at the root of the package repository for more details. The Makefile is a document that houses a collection of directives for constructing the package. It delineates the interdependencies among files and outlines the requisite commands for their compilation.

--- a/vignettes/pub-TreinenOthers2024.jpg
+++ b/vignettes/pub-TreinenOthers2024.jpg