My Information Extraction Process Part 2

My Note Taking Method for Information Extraction

by Dain Deutschman

01-10-2022

version 1.5

Introduction

It’s time-consuming, but good note taking and information extraction pay off in multiple ways:

  1. It produces a product that can be used as raw material for various other projects in the future, versus simply reading and remembering only part of it (or forgetting most of it).
  2. Good notes, or better yet, information extraction, are indispensable for research and writing!
  3. Going through a process of active learning helps you retain more of what you learn.

Note: Because it is so time-consuming, it is important to vet which resources make it into the advanced stages of this process. That is why we cannot skip the initial evaluation stage of determining which sections, chapters or pages of a book make it into the process.

“Of making many books there is no end, and much study wearies the body.” Ecclesiastes 12:12

Definitions

Before we get started, it is necessary to define some terminology.

📕

Note Taking - The process of writing down information such as from a class or a book. This is a very wide definition and varies greatly person to person.

📕

Information Extraction - A subcategory of note taking, information extraction is the structured process of extracting information from a given resource (class, book, website) for purposes of additional processing later on. This is the very detailed raw material that can be used for additional research, writing, blog posts, books, etc. It is more structured than simple note taking, and much more involved, with a goal in mind as to how this raw information will make its way into the final product.

📕

Active Learning - Research shows that the more we review and interact with information through a variety of means, the better and more permanently we learn it. For example, writing notes by hand has been shown to be more effective for information retention than simply typing notes (reference). Also, working with information repeatedly with breaks between the reviewing, to the point of being able to distill the information into a simple summary and then teach it over and over again, is one of the best ways to retain it.

My Information Structure Model

There are many sources of information, and sometimes the actual process of extracting information needs to vary depending on the source. For example, how information is extracted from a physical book may differ from the process needed for a Kindle book, research paper or YouTube video. This is mostly due to the inherent nature of these mediums (more on this later). However, while the process may differ slightly depending on the source, the model for how this information is organized remains consistent across the board.

I use a modified version of Dr. Harris’ method of Bible study highlighting. What I do is use the basic structure of Headings, I, II, III...A,B,C...1,2,3...a,b,c...i,ii,iii...etc. Let me explain.

The philosophy is to extract the author’s meaning into headings, main points, sub-points and a variety of what I will call “meta-points” and other specialized indicators that point to, and assist in the processing of, this raw information as it makes its way into the final product (a podcast, blog, book summary, research paper, etc). The basic structure is as follows:

Heading

Main Point (I,II,III...)

Sub-Point-ALPHA (A,B,C...)

Sub-Point-Num (1,2,3...)

Sub-Point-beta (a,b,c...)

Sub-Point-roman (i,ii,iii...)

👉

In more detail, here is an example:

Headings/Titles - simply written or typed. For example, taking notes in an introduction of a research paper, I may simply write or type the word “Introduction.”

Main Point - use capital Roman numerals for each main point as it relates to the heading. For example, I may write or type the basic idea next to a Roman numeral, like below.

Sub-Point-ALPHA - use capital alphabetical letters such as A,B,C-Z as it relates to the Main Point. Example below.

Sub-Point-Num - use numbers such as 1,2,3 as it relates to the Sub-Point-ALPHA. Example below.

Sub-Point-beta - use lower case alphabetical letters such as a,b,c as it relates to the Sub-Point-Num. Example below.

Sub-Point-roman - use lower case Roman numerals such as i,ii,iii as it relates to the Sub-Point-beta. Example below.

Introduction

I. Recent archeological discoveries have revolutionized the field of Biblical studies.

A. Egyptian hieroglyphics.

  1. Newly discovered inscription of Pharaoh Hophra, mentioned in Jeremiah 44:30.
    a. This is what the LORD says: ‘I am going to deliver Pharaoh Hophra king of Egypt into the hands of his enemies who want to kill him, just as I gave Zedekiah king of Judah into the hands of Nebuchadnezzar king of Babylon, the enemy who wanted to kill him.’
      i. Hophra in Logos Factbook
  2. Faience plaque inscribed with the cartouche of Pharaoh Hophra.

B. Parallels between the Bible and surrounding ancient cultures.

End of example

That is the basic structure, which can take the form of a simple typed or written outline. However, there are advanced attributes that I have added in for my own methods, and use only as needed, which the reader may find unnecessary. Nevertheless, I will spell them out here and also encourage anyone adapting this methodology to invent other attributes which help their unique way of processing information or doing research.
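For anyone who keeps their outline digitally, the five levels can be sketched as a small nested data structure. This is only an illustration under stated assumptions, not a tool used in this process: the names `Point`, `LABELERS` and `render` are invented for the sketch, and it assumes no level has more than 26 sibling points.

```python
from dataclasses import dataclass, field


@dataclass
class Point:
    """One entry in the Structured Outline; children sit one level deeper."""
    text: str
    children: list["Point"] = field(default_factory=list)


def to_roman(n: int) -> str:
    """Convert a positive integer to an uppercase Roman numeral."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = ""
    for value, symbol in pairs:
        while n >= value:
            out += symbol
            n -= value
    return out


def to_alpha(n: int) -> str:
    """Convert 1..26 to A..Z (assumption: at most 26 siblings per level)."""
    return chr(ord("A") + n - 1)


# Label style per depth: Main Point, Sub-Point-ALPHA, -Num, -beta, -roman
LABELERS = [
    to_roman,
    to_alpha,
    str,
    lambda n: to_alpha(n).lower(),
    lambda n: to_roman(n).lower(),
]


def render(points: list[Point], depth: int = 0) -> list[str]:
    """Return the outline as indented lines, one label style per level."""
    labeler = LABELERS[min(depth, len(LABELERS) - 1)]
    lines = []
    for number, point in enumerate(points, start=1):
        lines.append(f"{'  ' * depth}{labeler(number)}. {point.text}")
        lines.extend(render(point.children, depth + 1))
    return lines


# Shortened version of the example outline from this post
outline = [
    Point("Recent discoveries have changed the field.", [
        Point("Egyptian hieroglyphics.", [
            Point("Inscription of Pharaoh Hophra.", [
                Point("Quoted verse goes here.", [Point("Hophra in Logos Factbook")]),
            ]),
            Point("Faience plaque."),
        ]),
        Point("Parallels with surrounding cultures."),
    ]),
]
print("\n".join(render(outline)))
```

Keeping the label style in one list means a different numbering scheme can be swapped in without touching the notes themselves.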

Meta Points

Meta points are like metadata. For example, a tag attached to a database item is metadata. It provides a way to further categorize, sort, filter or identify structured data, without changing the main structure or outline. For example, I will underline text within a point in the outline in order to indicate sub-points within that point.

👉

Start of Example: Meta Points

Common Behavior in Cats

I. Aggressive

A. Biting - when cats bite it can be a loving nibble, a playful holding bite or an aggressive bite that breaks the skin.

End of Example

Notice that I underlined three sub-points. Rather than create another level of the outline (1,2,3), I can call these three points out as a sort of metadata within the Sub-Point-ALPHA level of the outline. This makes it possible to define or call out additional organization or structure in a way that keeps things neat and orderly within the standard. I may use Meta Points in various ways, not always in the same way, or not at all. It’s just another way of calling out information as I extract it.
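The same idea can be sketched in code for digital notes: the point keeps its place in the outline, and the marked phrases hang off it as metadata. This is a minimal sketch, not an existing tool; `OutlinePoint`, `MetaPoint`, `mark` and `phrases` are hypothetical names, and the three marked phrases are an assumption about which words would be underlined in the cat example.

```python
from dataclasses import dataclass, field


@dataclass
class MetaPoint:
    level: str   # which marking style was used, e.g. "Meta-A"
    phrase: str  # the exact words marked inside the point's text


@dataclass
class OutlinePoint:
    label: str   # position in the outline, e.g. "A"
    text: str
    meta: list[MetaPoint] = field(default_factory=list)

    def mark(self, level: str, phrase: str) -> None:
        """Attach a meta point without changing the outline structure."""
        if phrase not in self.text:
            raise ValueError(f"phrase not found in point {self.label}: {phrase!r}")
        self.meta.append(MetaPoint(level, phrase))

    def phrases(self, level: str) -> list[str]:
        """Return the phrases marked with the given meta level, in marking order."""
        return [m.phrase for m in self.meta if m.level == level]


biting = OutlinePoint(
    "A",
    "Biting - when cats bite it can be a loving nibble, a playful holding "
    "bite or an aggressive bite that breaks the skin.",
)
# Assumed underlines: the three kinds of bite
biting.mark("Meta-A", "a loving nibble")
biting.mark("Meta-A", "a playful holding bite")
biting.mark("Meta-A", "an aggressive bite that breaks the skin")

for phrase in biting.phrases("Meta-A"):
    print(phrase)
```

The outline still has a single Sub-Point-ALPHA entry; the three phrases can be filtered or listed later without adding a 1,2,3 level.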

I will typically have several levels of Meta Points, each with a different way of indicating it. For example, each level gets a different highlighter color: Meta-A, Meta-B, Meta-C, etc. Or, if I do not want to highlight in a physical book, I may use different styles or colors of text (or underlines): Meta-A, Meta-B, Meta-C, etc. I will lay out my system in detail at the end of this post, but it is really up to the user to be creative and use what works for them.

Other Attributes

I use additional attributes to call out specific points within the information that I know I will use in a given end product or research. For example, if a point resonates with me in such a way that I really never want to forget it, I will mark it with a squiggly underline. I call this an Amazing Point. Amazing Points make it into a special notebook where I quote the text with the source reference. Some other ways of marking information:

  • If a particular topic would be good to cover on the podcast, I mark it with a triangle in the margin.
  • If there is a long block of text and I do not want to bother extracting information in outline form right away, I will bracket the text in the margin and mark it with a square.
  • I circle key words in the text and words that need definitions.

Hopefully you get the idea: we want to be able to extract information from whatever source and have a structured way to store that information and mark it in various ways, so that when we get around to writing that book or paper, we can make really good sense of the information. Also, as mentioned above, capturing information in an active way like this helps our brain retain it.

Different Methods for Different Sources

I mentioned above that different sources sometimes require some tweaking of the process. This is true depending on how you approach the process. I am drawing a distinction here between the overall process and simply recording the data directly into the outline structure. And it’s this overall process which really changes. Basically, it comes down to whether or not we mark the information during an initial review. For example, while reading a physical book, I like to get a general idea of the book during a first pass reading or skimming. Once I have a handle on the contents, there may be areas I want to focus on, or I may want to extract the meanings from the entire book. This is where I will go through the specific areas of interest and mark the information. I usually do not like to highlight in a physical book, so I will use a pencil. As I’m reading I will draw a box around the heading, a capital Roman numeral in the margin next to the Main Point, and a 1,2,3 or a,b,c or i,ii,iii next to the Sub-Points. The idea is to mark it up on a second pass so that I can come back later to extract the information into the outline described above. This works well because, much of the time, I will process a section of a book and then get pulled away into other things and may not get back to it for months. If I have marked it, I do not have to re-read it; I can simply start by extracting the markings into my notebook.

💡

Because of the nature of various media, this overall process of marking will differ. For example:

  • Physical Books - as mentioned, I do not like to highlight in these so I can’t use color. Instead I use a slightly modified method of marking with a pencil.
  • Logos Bible Software - because of the very detailed and extensive highlighting features, I have an entire system of marking/highlighting in Logos.
  • Kindle - Kindle and other e-readers (PDF readers included) vary in which features are available for highlights and notes. So, I have a more limited basic system for Kindle. The great thing about Kindle, though, is that the highlights and notes can be exported and acted upon in further detail at that point.
  • Research Papers or other printed material - for printed material that I do not care about preserving, such as PDF files printed on my office printer, I “mark away” with highlighters. This can be a lot like the Logos method, only with analog tools (real-world highlighters, not digital highlighters on your computer).

Use a Reference Manager for Citations

Part of why Information Extraction is different from simple Note Taking is the forward-looking aspect of considering how one will deal with citing one’s sources in the final product (book, research paper, blog, podcast, etc.). This is where a Reference Manager becomes important. Gone are the days of having to type in citations manually, paying careful attention to the format (Turabian, APA, MLA, etc.). With a Reference Manager, all of those references (books, periodicals, websites, etc.) are stored in a database that is connected to your word processor. The task of citing reference works is automated! When my college-age daughter found out about this, it changed the game for her! It just saves SO much time! Beyond saving time, though, it makes for a great organization tool, especially when a large project is being tackled over a long period of time.

There are many Reference Managers to choose from. I only have experience with one: Zotero. Zotero is great because it is Open Source (FREE) and well supported by the Open Source community. Zotero can be downloaded at https://www.zotero.org/ and integrated with a number of word processors, including Microsoft Word and Google Docs.

👉

Example: Here is a screenshot of a collection of resources I have built up for a Biblical interpretation class and a high level process of how a Reference Manager is used to cite a source.

[Screenshot: a Zotero collection of resources for a Biblical interpretation class]

These resources can be manually entered, imported from WorldCat or Amazon (with the Zotero browser plugin) or imported through a file format called BibTeX.¹
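For readers who have not seen BibTeX before, it is a plain-text format in which each reference is an entry with a type, a citation key and a set of named fields. The sketch below builds one such entry in Python. It is only an illustration: `to_bibtex` is a hypothetical helper, and the book, author, publisher and key shown are placeholders rather than a real reference.

```python
def to_bibtex(entry_type: str, cite_key: str, fields: dict[str, str]) -> str:
    """Format one reference as a BibTeX entry (values wrapped in braces)."""
    lines = [f"@{entry_type}{{{cite_key},"]
    for name, value in fields.items():
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Placeholder data, not a real book
entry = to_bibtex(
    "book",
    "doe2020example",
    {
        "author": "Doe, Jane",
        "title": "An Example Book Title",
        "publisher": "Example Press",
        "year": "2020",
    },
)
print(entry)
```

Saving text like this to a file with a `.bib` extension gives the kind of file a BibTeX import works from.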

Once Zotero and the word processor are integrated, a source can be cited through the word processor menus. At a high level, there are four steps:

  1. Select area of text to insert citation.
  2. Go to the Zotero menu in Google Docs and choose Add Citation.
  3. Search for and choose citation from the Zotero popup database. Insert page numbers.
  4. The Citation is footnoted.

End of Example

The process outlined in the example above would take place during the writing phase of a paper or book. However, if we take a few steps back and consider what is needed from an Information Extraction Process to support this, we realize two things are needed in our Structured Outline. First, we need to ensure that we keep track of the page numbers. Second, we need a way to distinguish between summarized information, quotes and our own thoughts or comments. What I use is the following system:

👉

Start Example

  • Quotes - simple default text, copied verbatim from the source. Example below.
  • Summarized Text - text that is summarized in my own words is written in Green ink, or typed and highlighted in Green. Example below.
  • My Own Words/Comments - any of my own commentary is either written in Blue ink, highlighted in Blue or highlighted in Yellow within the Structured Outline. Example below.
  • Page numbers are in parentheses. Example below.

(1)

Introduction

I. Recent archeological discoveries have revolutionized the field of Biblical studies.

(21)

A. Egyptian hieroglyphics.

  1. Newly discovered inscription of Pharaoh Hophra, mentioned in Jeremiah 44:30.
    a. This is what the LORD says: ‘I am going to deliver Pharaoh Hophra king of Egypt into the hands of his enemies who want to kill him, just as I gave Zedekiah king of Judah into the hands of Nebuchadnezzar king of Babylon, the enemy who wanted to kill him.’
      i. Hophra in Logos Factbook
  2. Faience plaque inscribed with the cartouche of Pharaoh Hophra.

(43)

The Faience plaque looks like it would make a great book-end!

B. Parallels between the Bible and surrounding ancient cultures.

End Example

The reason behind this, again, is so that we accurately communicate the author’s thoughts, avoid plagiarizing and distinguish our own thoughts from those of the author. Furthermore, because we have the page numbers above the text in question, we always know on what page the quote or summarized text was found (in order to accurately cite it).
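For notes kept digitally, the color coding and page numbers can be sketched as two extra fields on each entry. This is a minimal sketch under stated assumptions: `Kind`, `Entry` and `pages_to_cite` are hypothetical names, and which lines of the example count as quote, summary or comment is assumed here for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    QUOTE = "quote"      # default text, copied verbatim from the source
    SUMMARY = "summary"  # green: the author's idea in my own words
    COMMENT = "comment"  # blue or yellow: my own thoughts


@dataclass
class Entry:
    page: int  # page of the source where the material was found
    kind: Kind
    text: str


def pages_to_cite(entries: list[Entry]) -> list[int]:
    """Pages that need a citation: quotes and summaries, not my own comments."""
    cited = {e.page for e in entries if e.kind in (Kind.QUOTE, Kind.SUMMARY)}
    return sorted(cited)


# Assumed classification of lines from the example above
notes = [
    Entry(21, Kind.SUMMARY, "Newly discovered inscription of Pharaoh Hophra."),
    Entry(21, Kind.QUOTE, "Faience plaque inscribed with the cartouche of Pharaoh Hophra."),
    Entry(43, Kind.COMMENT, "The Faience plaque looks like it would make a great book-end!"),
]
print(pages_to_cite(notes))
```

Because a comment is my own thought, page 43 drops out of the list, while both the quote and the summary on page 21 still point back to the source.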

Summarize First

Finally, before I get into a detailed example of the entire process, there is one rule to always keep top of mind: Summarize First. What this means is to always seek to summarize any information that is being extracted, rather than quote it verbatim. Sure, there will be many times where a quote is necessary, but one should strive to keep this to a minimum. Summarizing is a great form of active learning!

My Logos Information Extraction Process

Finally, let’s pull this all together and describe the process in detail for a particular medium: Logos Bible Software. Logos is heavily used in my seminary education. All of the lectures, textbooks, journals and other resources are in Logos. The school provided a method for citing papers, which involved the use of a reference manager. First comes a high-level overview, and second, the details of each process with screenshots.

High Level Process

First Pass: skim or read through an entire resource (book) to get familiar. Sometimes I do this while exercising (video/audio) or in bed after a long day when my mental capacity and energy levels are expended.

Second Pass: choosing either a relevant section of the text or the entire text (depending on what I decide in the First Pass), I will begin reading and highlighting/marking. The highlights and marks mirror the Structured Outline, with certain colors representing Main Point, Sub-Point, etc. In this stage, I will sometimes answer any required questions from seminary coursework as well as take certain brief notes in the Logos built-in notebooks (more on this later). These notes are for things like key words, definitions, book lists, etc. These are not the Structured Outline notes just yet.²

Third Pass: I transfer the highlights and marks to a notebook in the format of the Structured Outline. The idea here is mainly to summarize as much as possible and make my own comments. I typically write these notes by hand into a Rocketbook (RB) notebook, using black ink for quotes, green for summarized text and blue for my own thoughts. I use RB because I want to incorporate the cognitive benefits of handwritten notes rather than just copy-pasting everything (and typing). Furthermore, RB text can be made into digital text through the RB OCR App so that the Structured Outline information can be searched on the computer and copy/pasted into a word processor during the production phase.³

Fourth Pass: OCR the Structured Outline from RB into Evernote. Once in Evernote, I will annotate the note page images to match the color scheme (highlight green, yellow, etc.) because the color does not transfer well from RB to Evernote. This is less than ideal and makes for more work, but the positive part is that it is another opportunity to review the information. Science shows that repeated review of information separated by varying lengths of time is key for information retention. There is a little bit of cleanup needed, as OCR is not 100% accurate. I also assign tags and merge multiple pages into a single document, attaching a reference in the Zotero structure that these notes exist in Evernote/Rocketbook.

Fifth Pass: Review the seminary questions for the unit I am working on again, adding to my writing based on what I have learned in the process of reviewing/transferring notes. Also, answer any unit objective questions (what I have been expected to learn in the unit, what I am measured on).

First Pass Detailed Process

There is not much more to say when it comes to the first pass. There are a few guidelines I follow when evaluating the material for the first time.

  • Read the back cover and praise for the book. Page through it and get a feel for it.
  • Browse the table of contents and skim any interesting sections.
  • Read the introduction and any interesting chapters or sections. Read the conclusion.
  • If I feel like only certain sections need processing, mark those and move on to the Second Pass.
  • If I feel like the entire book needs processing, then move on to Second Pass.

Second Pass Detailed Process

Since this is a Logos book (or resource), and assuming I am not pressed for time, I will highlight and mark the book according to my Logos highlighting/marking rules, which are as follows:

Headings and Titles (and Logos MobileEd Objectives)

My color for highlighting the headings is orange. Usually, I also highlight the chapter title and objectives. For the Objectives, I will create a yellow note in a Logos notebook and copy/paste the objective list into it. Then later, I will go back and answer the questions/objectives. In Logos, I will format this note by bold font in the title and filter by note. Example below.

Key Word/Definitions

When I run across a word I don’t know the meaning of, I look it up and attach a note. So, for Logos reading I can highlight the word, right-click and choose Lookup. Logos will open a dictionary resource where I can copy the definition and paste it into a note. I like to turn citations on so that I have a record of where I found that definition. Then, I highlight the word with my Key Words/Definitions highlighter, which in my case is a blue box around the word. This way, when I’m reviewing my highlights later on, I see a note icon and blue box indicating that the contents of the note are a definition. I think it also helps to change the note icon color so that I know it really is just a definition rather than any other thoughts or quotes. I choose blue so that it matches the highlighter box.

Questions in the Text

If I run across a question, either explicitly stated or in my own mind, I will highlight the associated text with the Question highlighter. The purpose here is to draw my attention to ask/answer the question in my note taking later on (the review of highlights and Rocketbook notes).

Names/Authorities

To deal with important names/authorities and their associated works or high-level comments/ideas, I first highlight the name, and any quotes or associated ideas attributed to them, in a red box. When transferring to Rocketbook/Zotero I won’t do anything special with the name/authority; I just mention it as it makes sense in my overall process. But in addition, I add this information to a special notebook with a red icon. This way, I have a record of works, quotes and/or ideas associated with a given individual or institution, along with the Logos resource where that information can be found (as an Anchor).

Meta Points

I use an additional highlighting method which I refer to as Meta Points, which is basically to say I have another way of marking information within already marked information, and I treat it like “meta-data”. I use this in various ways: to emphasize a particular phrase or word, to call out the organized structure of complicated sections of text, or to call out a series of points such as “three reasons for X”. In this example, I have a different double-underline for each of Vanhoozer’s three “greats”. I don’t show it in this example, but sometimes, when a lengthy section of text discusses several points that can be organized into point 1, 2, 3, etc., I can keep track of which text or information provided by the author relates to those points. It helps when going back to extract the information later because it’s already organized.

Here is an example of just using meta-point-a to emphasize verbiage I like that relates to a single point:

Notes About My Thoughts

When I have particular thoughts about a certain subject in the text, I will make a note with a green icon and may or may not highlight the text. Within my note, I follow the same conventions as in Zotero: if it is my personal thoughts, I highlight my words in yellow; if it is a summary of someone else's words, I highlight in green; and if it’s a quote, I leave it unhighlighted (and should cite as needed).

Book Lists

When I come across a book that I want to read or add to my library, I add it to my book list notebook. I will create a note for it and/or anchor it to the resource I discovered it from (which could help remind me later of why I wanted the book). I’ll use my Book List notebook and a light purple note icon, and also anchor it if needed.

Footnotes

Sometimes I will run across a footnote that contains especially good information/resources that I want to later add to book lists or keep track of/make notes on. In this case, in order to draw my attention to that particular footnote I will select (and sometimes highlight) the text with a black note attached and then copy and paste the footnote text into the note. This makes it easier to process the information in the footnote when I take another pass at the text.

Third Pass Detailed Process

Follow the Title, I,II,III, A,B,C, 1,2,3,a,b,c outline format. Use black pen for quotes, green for summary and purple or blue for comments.

Digitize Rocketbook Notes, OCR and Send to Evernote - then Tag

Use the Rocketbook app to take pictures of the notes, OCR them, and send them to Evernote in a bundle. Refer to Rocketbook tutorials on how to accomplish this. Once imported, be sure to tag the note with the class and session or other important information to make it easier to find in the future. In Zotero, add a tag, “Evernote” to the reference to indicate where the notes are. Finally, export the reference in Zotero as a Bibliography to Clipboard, and name the note the same as the Zotero reference. This makes it easier to identify. If there are multiple notes for one document, merge them.

Annotate and Clean Up Digitized Rocketbook Notes in Evernote

Use the Evernote annotation tool to highlight the sections in the digitized note. The reason for this is that the colored ink we used in the notes does not transfer when an image is transcribed with OCR. So, before erasing the ink from the Rocketbook, highlight the right sections (green for summary, yellow for comments) so that we can keep track of it for future citations. For the transcribed text, the OCR may not be 100% accurate, so going through and cleaning that up is necessary. This functions as a good note review to cement this material in the brain.

¹ There are other methods of import as well. Check out Zotero.org.

² Highlighting at this stage sometimes also depends on time constraints. If very pressed, I tend to just highlight the headings and any info directly applicable to the seminary questions vs. mirroring the Structured Outline with different highlighter colors.

³ If the text is sufficiently complex, I skip past the highlighting and go directly to summarizing. I will often mark blocks of text to be summarized, skip highlighting that portion, and come back later to finish summarization into the Structured Outline.