Your Simple Guide to Collecting Oral History

Collecting memories from people is an excellent way to celebrate the experiences of others. I have found it helps me learn why people hold certain beliefs, how they overcame hardships, and more about the world we live in. Interviewing others has also taught me about myself, which is why I wanted to write up a guide for collecting other people's stories.

The most obvious aspect of collecting stories is the interview itself. There are a ton of resources by people much more experienced than I am on how to conduct an oral history interview. It is important to come up with a sample outline of questions and use that as a starting point. I continue to consult such resources to help me prepare for interviews.

But the interview is only part of the process. I wanted to provide a quick write up on how anyone can get started collecting oral histories without worrying about whether they are doing it “right.” The following sections lay out what equipment you will need, what software to use, the digital formats you will deal with, a simple way to edit the audio, and finally how to share it. It is not the “right” way or the “best” way. Instead it is a process that anyone with motivation and access to a few resources should be able to recreate.

Equipment

The equipment needed is pretty straightforward. You will need something to collect the interviews with, and something to process them. For processing, you will want a computer. While it is possible to capture and process audio files on a tablet, smartphone, or even tape, this post assumes you would like a little more freedom in working with the files. Let’s look at three options for capturing audio.

Good: Use a recorder application and microphone, like the one on iPhone headphones, to capture the person speaking. This option will allow you to capture a single interviewee's responses at fairly decent quality. It may not work well in an environment with lots of ambient noise or when interviewing multiple people at once, for example, Grandma and Grandpa.

Better: Use a multichannel field recorder. The TASCAM DR-40X, for example, can capture stereo from the device itself and can be mounted on a tripod near the interviewee for more consistent levels. It also takes up to two microphone inputs for easy portable recording. These devices cost between $100 and $400, which can be cost prohibitive for some projects. They also require configuration, which can seem daunting to people who consider themselves "nontechnical."

Best: A multi-channel mixer or recorder with multiple microphones. Something like the Zoom F6 Multitrack Field Recorder captures multiple sources in a compact recording device. This option is more expensive and has more to set up. It is ideal when interviewing a small group of three or four people, since each speaker can have their own microphone, which isolates their responses onto a single channel. Having each voice on its own channel helps in post production: the channels can be edited in isolation, so pops and lip smacks can be removed from the source channel instead of having to be teased apart from a single mixed audio file.

All of these options will require a computer to collect, edit, and post the audio files. Depending on your project, you may have to convert files that other people provide, or even old tapes. File conversion can be performed by various audio editing tools; however, I will refrain from describing that process as it is outside the scope of this tutorial. Before stepping through how to edit or share audio files, an overview of how audio signals are stored digitally is worthwhile.

Digital Audio Formats in 1 minute

The process of capturing an audio signal in the real world, converting it to a digital file, and then saving that for use later is complex. There are many file formats designed to encode sound into data useful to various digital tools. Learning digital audio formats, even on a basic level, will help you make choices about how to capture, edit, store, and share the stories you record.

Lossy versus Lossless
A lossless file format stores all the data captured from the recording device. A lossy file format generally uses a compression algorithm to remove "unnecessary" data from the file in order to make it smaller. The difference is easier to see than to hear. The following photos demonstrate the impact lossless versus lossy formats can have on images.

The image on top is uncompressed, or lossless. The two below use different compression algorithms to create lossy files. Note the difference in size and quality. Credit: Shalomi Tal, Dual-licensed: GNU FDL + CC by-sa 2.5, 2.0 and 1.0

A size versus quality compromise also exists with audio files. When capturing audio, storing it in a lossless format is ideal. For example, I use a TASCAM field recorder and save the audio as 24-bit, 96 kHz WAV files, which means the files are large. The stereo signal being captured writes 576 KB of data per second, which is 34.56 MB a minute and 2.0736 GB an hour. A multi-hour interview can quickly eat up storage space. When transferring the interviews to my computer, I convert them to FLAC files, which reduces the size of each file through compression. FLAC is extra cool because its compression is lossless, meaning the original files can be reconstructed exactly, while the FLAC copies are 50-70% smaller.
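If you want to double-check that arithmetic, it is easy to script. A quick sketch using the settings from my recorder:

bit_depth = 24          # bits per sample
sample_rate = 96_000    # samples per second
channels = 2            # stereo

# Uncompressed PCM data rate: bit depth x sample rate x channels, in bytes.
bytes_per_second = bit_depth * sample_rate * channels / 8
print(f"{bytes_per_second / 1_000:.0f} KB/s")               # 576 KB/s
print(f"{bytes_per_second * 60 / 1e6:.2f} MB per minute")   # 34.56 MB
print(f"{bytes_per_second * 3600 / 1e9:.4f} GB per hour")   # 2.0736 GB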

What does this mean for your project? Record in a lossless format like WAV. When you finish editing the audio, convert it to FLAC, which will save you disk space and make uploading and/or sharing the files much easier. Luckily, free software exists to help with the editing and conversion process.
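As an aside, the WAV-to-FLAC conversion itself can also be scripted. Here is a minimal sketch assuming Python's soundfile package (my choice, not a requirement; any audio converter will do):

import soundfile as sf

# Read the lossless original and re-encode it as (still lossless) FLAC.
data, samplerate = sf.read('interview.wav')
sf.write('interview.flac', data, samplerate)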

Software

The free audio editing software of choice for years has been Audacity. If you have experience using it, go right ahead. If you do not, I find OcenAudio much more straightforward for editing interviews. OcenAudio's editing is non-destructive, which means it does not overwrite the original files with the edits you make. Instead, you create a new audio artifact by saving and exporting your changes. You can also use GarageBand if you are on a Mac. If you decide to use GarageBand or Audacity, you will need to do some homework to perform the actions I describe in the editing section.

Edit It

Editing is probably the most difficult step in this whole process. It requires you to listen to the audio in its entirety, multiple times. The goal when editing the interviews is not to editorialize them, but to make them simple to consume. I recommend creating roughly ten-minute chunks of audio with pauses, tics, and other distracting elements removed. This will make the final product easy to digest, which means it is more likely to get listened to. The steps for editing a sound file in OcenAudio are pretty simple.

  • Load a clip
  • Normalize the Audio
  • Fade in the beginning
  • Fade out the ending
  • Clip out pauses
  • Export

This process takes time. OcenAudio has a fairly easy-to-use interface, but knowing what the buttons do and how to use them can be daunting. There are helpful YouTube videos on basic editing in OcenAudio, including some that walk through a quick oral history workflow. I recommend watching one or two before getting started on your own project. Normalizing, fading, and clipping are the most repetitive tasks. To give you confidence in tackling your project, I will describe how to fade audio in and out, normalize volume, and clip out unwanted sections.

Fade Audio In/Out

Adding a fade to the beginning and end of the audio eases the transition between silence and the recording. It is very simple to do in OcenAudio. Select the region to add the fade-in to; once a region is selected, the fade-in quick icon becomes available. See the demonstration below.

Adding a fade in effect to the audio track

Normalize

This is one of the most important steps and can take a ton of time. The method I will describe is a quick fix. It will not make your interviews ready for NPR or the National Archives. It will, however, make them much easier to listen to. OcenAudio provides a normalization shortcut that attempts to smooth out the volume of the track. You know how, when watching TV, some commercials are WAY too loud? It is annoying! In fact, it bothered enough people to get a law passed in the United States called the CALM Act. Normalization helps prevent violent jumps in audio volume. To apply the normalization filter, select the audio and apply the effect as shown below.

Normalize a section of audio

Clipping

Clipping is one way to remove unwanted pops, curse words, embarrassing stories, mistakes, and whatever else may exist on the track that should not. It is a common task and simple to perform. The one gotcha is to make sure all channels are selected before performing the clip. Keep in mind you can always undo the last action.

Clip a section of audio

When the audio is in a state you feel comfortable sharing, save it by selecting File > Save As… and choosing a location on your computer. I strongly recommend saving the file in the FLAC format, but there are many other formats to choose from. Saving as MP3 would be appropriate if you planned on creating an MP3 CD and sharing that with family. In my experience it is easier to share via the web on a service like Soundcloud. The reasons for doing so are detailed in the next section.

One optional step is worth mentioning. Naming the files with a leading numerical indicator will help when trying to organize them later. For example, if I have three tracks called "Early Life", "Middle Life", and "Later In Life", I change the file names to "00 – Early Life", "01 – Middle Life", "02 – Later In Life." A two-digit prefix sorts correctly for up to 100 tracks; if you somehow have more than that, start the numbering at 000 instead. You can rearrange the track order on Soundcloud, but I find naming the files sequentially makes finding the right one easier.
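If you have many tracks, a short script can apply that numbering for you. A sketch with hypothetical track names (I use a plain hyphen in the code; use whatever separator you like):

import os

tracks = ['Early Life.flac', 'Middle Life.flac', 'Later In Life.flac']
for i, name in enumerate(tracks):
    os.rename(name, f'{i:02d} - {name}')  # '00 - Early Life.flac', etc.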

Share It

Figuring out how to share the file is largely up to you. For my purposes I wanted to place it in a location that was accessible to anyone with a phone and internet connection. I also wanted the files to be accessible to only those with whom I had shared a link. Soundcloud provides a very simple way to host audio files (up to 180 minutes for free) and share them without requiring everyone to create an account. You can listen to the files hosted there via a web browser or their mobile app. To make your interviews available to only those people you choose, follow these steps:

  • Create a Soundcloud account
  • Upload your audio files
  • Make sure the private option is selected
  • Once they are uploaded, create a playlist of the related tracks by visiting My Tracks and selecting all of the ones you would like to add to a playlist
  • Share the playlist link with your audience!

Recap

We just walked through a process for capturing, editing, and sharing oral histories. To provide a TLDR (Too Long; Didn't Read):

  • Come up with an outline of questions to ask
  • Schedule interview times. Try to keep each session limited to between one and two hours.
  • Procure the necessary recording equipment and practice recording yourself before the first interview.
  • Configure your capture devices to use lossless file formats.
  • Record the interviews. Try to place the microphone as close to the speaker as possible without getting in the way.
  • Edit the audio files and save them in a lossless format.
  • Share the interviews!

The process described above is not the only way. As you get more comfortable, tweaks can allow you to tailor the process to your situation and style. My goal in sharing these steps is that, hopefully, you will be inspired to collect the stories of the people around you. And when you do, please let me know how it goes!

Troubleshooting Windows Subsystem for Linux and SSH

The Windows Subsystem for Linux (WSL) is one of the best features on Windows 10. It makes development so much easier than it used to be but still has a few hiccups. Kinda like Linux, some things don’t “just work.” One pesky thing that I recently dealt with was getting SSH to work with a keypair file from WSL. Here is how to get SSH working on WSL.

Goal

Given a keypair file, we want to invoke ssh from the command line and establish a connection to another server. This is a common task when connecting to remote servers. Think AWS, Azure, or DigitalOcean. It is a simple command:

$ ssh -i /path/to/keypair.pem user@hostname

But WSL may throw a permissions error similar to:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions 0777 for 'keypair.pem' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
Load key "keypair.pem": bad permissions
Permission denied (publickey).

Why?

Windows and Linux manage file permissions in different ways. For a detailed look at interoperability between the Windows and Linux file systems, see the blog post linked below. Some excerpts are included.

File systems on Windows

Windows generalizes all system resources into objects. These include not just files, but also things like threads, shared memory sections, and timers, just to name a few. All requests to open a file ultimately go through the Object Manager in the NT kernel, which routes the request through the I/O Manager to the correct file system driver. The interface that file system drivers implement in Windows is more generic and enforces fewer requirements. For example, there is no common inode structure or anything similar, nor is there a directory entry; instead, file system drivers such as ntfs.sys are responsible for resolving paths and opening file objects.

https://docs.microsoft.com/en-us/archive/blogs/wsl/wsl-file-system-support

File systems on Linux

Linux abstracts file systems operations through the Virtual File System (VFS), which provides both an interface for user mode programs to interact with the file system (through system calls such as open, read, chmod, stat, etc.) and an interface that file systems have to implement. This allows multiple file systems to coexist, providing the same operations and semantics, with VFS giving a single namespace view of all these file systems to the user.

https://docs.microsoft.com/en-us/archive/blogs/wsl/wsl-file-system-support

One of the many differences is how file permissions are handled. Linux flips permission bits on each file. For a detailed explanation, read the chmod man page or the excellent blog post by Nitin V.

Table of Linux File Permissions – https://nitstorm.github.io/blog/understanding-linux-file-permissions/

Windows stores files as objects. The Windows file system manager provides a general interface for dealing with file objects and leaves fine-grained operations to the file system drivers. One of the ways users bump into those differences is through file permissions. WSL, in trying to provide a level of interoperability, attempts to support working on files across both systems through a single interface.

WSL Virtual File System – https://docs.microsoft.com/en-us/archive/blogs/wsl/wsl-file-system-support#file-systems-in-wsl

This is cool, complex, and has unintended side effects. One of them: SSH can run into file permission problems if the host drive is mounted without file permissions exposed in a Linux-friendly format. There are two ways to fix it. The simple way is to place the .pem file inside the WSL file system, for example in your home directory (cd ~). Once it is there, you should be able to run $ chmod 600 keypair.pem and create an SSH session. The other, more involved way is to remount the drive with metadata enabled. You will need to unmount the drive, remount it with DrvFs, and verify that the additional metadata is enabled. To do that, run:

$ sudo umount /mnt/c
$ sudo mount -t drvfs C: /mnt/c -o metadata
$ mount -l
rootfs on / type lxfs (rw,noatime)
root on /root type lxfs (rw,noatime)
home on /home type lxfs (rw,noatime)
data on /data type lxfs (rw,noatime)
cache on /cache type lxfs (rw,noatime)
mnt on /mnt type lxfs (rw,noatime)
none on /dev type tmpfs (rw,noatime,mode=755)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,noatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,noatime)
devpts on /dev/pts type devpts (rw,nosuid,noexec,noatime,gid=5,mode=620)
none on /run type tmpfs (rw,nosuid,noexec,noatime,mode=755)
none on /run/lock type tmpfs (rw,nosuid,nodev,noexec,noatime)
none on /run/shm type tmpfs (rw,nosuid,nodev,noatime)
none on /run/user type tmpfs (rw,nosuid,nodev,noexec,noatime,mode=755)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noatime)
C: on /mnt/c type drvfs (rw,relatime,metadata,case=off)

Confirm that metadata is enabled, and you should be able to chmod the keypair file and initiate an SSH session. WSL is being improved constantly and has become much better over the past couple of years. I use it every day at work and at home. Its functionality has me working more in the Windows environment despite my home machine's dual boot capabilities.
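One closing tip: a drive remounted by hand reverts to the default options after a restart. If you want metadata applied automatically, WSL reads /etc/wsl.conf at startup; its documented [automount] section looks like this:

[automount]
enabled = true
options = "metadata"

After saving the file, fully restart WSL (or reboot) and /mnt/c should come up with metadata enabled.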

Kafkacat Amazon Workspace

Below are some notes on getting kafkacat installed on an Amazon workspace with admin access.

The commands listed on the GitHub page will not work without a little preparation. A Linux Amazon Workspace image is based on Amazon Linux, and attempts to use a package manager like yum go through a plugin, amzn_workspaces_filter_updates, which exposes only a handful of packages (30 at the time of this writing). The first thing to do is add the Extra Packages for Enterprise Linux (EPEL) repository to the instance's package repositories. Following the instructions in the Fedora FAQ, run:

$ su -c 'rpm -Uvh https://download.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm'

This will add the EPEL repository to the Amazon Workspace, which allows you to download standard Linux packages that a stock Workspace installation may be missing. For example, if you want to connect to another machine through VPN, you can install vpnc with yum:

sudo yum install vpnc

Next, install kafkacat, which enables the Amazon Workspace to connect to a Kafka queue. A few changes to the steps outlined in the Confluent documentation will allow a user with elevated privileges to install the package. The first step is to add the Confluent package repository to your list of yum repos. This can be done by creating a file named confluent.repo in the /etc/yum.repos.d/ directory.

$ sudo vim /etc/yum.repos.d/confluent.repo

Insert the following text into the file:

[Confluent.dist]
name=Confluent repository (dist)
baseurl=http://packages.confluent.io/rpm/3.1/7
gpgcheck=1
gpgkey=http://packages.confluent.io/rpm/3.1/archive.key
enabled=1

[Confluent]
name=Confluent repository
baseurl=http://packages.confluent.io/rpm/3.1
gpgcheck=1
gpgkey=http://packages.confluent.io/rpm/3.1/archive.key
enabled=1

Then clear the yum caches:

$ sudo yum clean all

The kafkacat library has a dependency that can be installed from the Confluent repo called librdkafka-devel. Install that dependency with yum, and then you can build kafkacat from source.

$ sudo yum install librdkafka-devel 

To build kafkacat, follow the instructions in the GitHub README's section on building from source. Clone the repository to the desired location.

$ git clone https://github.com/edenhill/kafkacat.git
$ cd kafkacat/
$ ./configure
checking for OS or distribution… ok (Amazon)
checking for C compiler from CC env… failed
checking for gcc (by command)… ok
checking executable ld… ok
checking executable nm… ok
checking executable objdump… ok
checking executable strip… ok
checking for pkgconfig (by command)… ok
checking for install (by command)… ok
checking for rdkafka (by pkg-config)… ok
checking for rdkafka (by compile)… ok (cached)
checking for librdkafka metadata API… ok
checking for librdkafka KafkaConsumer support… ok
checking for yajl (by pkg-config)… failed
checking for yajl (by compile)… failed (disable)
checking for avroc (by pkg-config)… failed
checking for avroc (by compile)… failed (disable)
Generated Makefile.config
...
$ make
cc -MD -MP -g -O2 -Wall -Wsign-compare -Wfloat-equal -Wpointer-arith -Wcast-align  -c kafkacat.c -o kafkacat.o
gcc -MD -MP -g -O2 -Wall -Wsign-compare -Wfloat-equal -Wpointer-arith -Wcast-align  -c format.c -o format.o
gcc -MD -MP -g -O2 -Wall -Wsign-compare -Wfloat-equal -Wpointer-arith -Wcast-align  -c tools.c -o tools.o
Creating program kafkacat
gcc -g -O2 -Wall -Wsign-compare -Wfloat-equal -Wpointer-arith -Wcast-align  kafkacat.o format.o tools.o -o kafkacat -lrdkafka
$ make install
Install kafkacat to /usr/local
install -d $DESTDIR/usr/local/bin && \
install kafkacat $DESTDIR/usr/local/bin
echo install -d $DESTDIR/usr/local/man/man1 && \
echo install kafkacat.1 $DESTDIR/usr/local/man/man1
install -d /usr/local/man/man1
install kafkacat.1 /usr/local/man/man1

To verify the installation worked correctly, run:

$ kafkacat --help

If everything went smoothly, the program should be installed and available on your Linux based Amazon Workspace.
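From here you can point kafkacat at a cluster. For example, to print cluster metadata or consume a topic from the beginning (the broker address and topic name below are placeholders):

$ kafkacat -b broker1:9092 -L
$ kafkacat -b broker1:9092 -t my_topic -C -e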

Processing Audio Files with Amazon Transcribe

I have been working on collecting a family's oral history for the past few months. During the process I took notes with simple descriptions of what the speaker was describing or telling and a rough timestamp of where in the file the conversation took place. After collecting hours of stories, I realized that a transcription would make things much easier to search and perhaps more useful to those interested in these particular histories. Why not get a transcription via one of the cloud offerings? Amazon offers a service called Transcribe as part of the AWS suite. Since I have a small account and some credits to burn, I figured I would kick the tires and see how Transcribe performs on meandering oral history interviews. But before I jump into the how, let me describe my particular use case.

Photo by Sam McGhee on Unsplash

Over the course of a few months, I collected several half-hour to two-hour interviews using multiple recording setups. Some files were captured with Open Broadcaster Software (OBS) and include video. Others were captured using a TASCAM field recorder and are .wav files. To get the audio of each interview assembled and normalized, I used a free application called ocenaudio. It allowed me to load .flac, .wav, and other audio formats into the same editing channels and add various effects to sections or to the entire workspace. Ocenaudio's interface allows for simple drag-and-drop editing of sound files. It is also worth noting that ocenaudio is non-destructive, meaning it leaves the original audio file alone. This can be a little confusing if you are not used to software that performs non-destructive editing: when a project is saved, it exports the results to a new file. Keep that in mind if you plan on adding additional files later.

After I collected and normalized all the sound files, I turned to AWS for a transcription. Transcribe limits input files to 2 GB. The first collection of interviews was about three hours long and 2.5 GB in size, so I split the collection into two smaller files. AWS needs the file to be uploaded to S3 in order for Transcribe to access it. Here is how I set up the S3 bucket with the audio file.

  • Place it in a region near your house. Or don't; it's up to you.
  • Give the S3 bucket a unique name. You can choose whatever you like as long as it conforms to the S3 naming standard.
  • Block all public access.
  • Encrypt the bucket. I used AES-256 encryption.
  • Give read and write access to your AWS account.
  • Set the S3 bucket to the Intelligent-Tiering storage class. It does not matter too much unless you forget to spin the bucket down, in which case Intelligent-Tiering will push the storage into lower cost, slower storage.
  • Upload the file. As long as each file is less than 50 GB, you can use the browser upload tool. Otherwise, you will need to download, configure, and use the command line utility (or script the upload, as sketched below).
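If you would rather script the upload, boto3 (the AWS SDK for Python) handles large files and multipart uploads for you. A minimal sketch; the bucket and file names are placeholders:

import boto3

s3 = boto3.client('s3')
s3.upload_file('interviews_part1.flac',    # local file
               'my-oral-history-bucket',   # bucket name (placeholder)
               'interviews_part1.flac')    # object key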

Once your file is uploaded to S3, select the checkbox next to it and copy the “Object URL.” Transcribe needs this in order to find and process the file. If these steps do not work, check the AWS tutorial on connecting an S3 bucket to a Transcribe instance.

Screenshot from the AWS tutorial

Once the files are available in S3, setting up a Transcribe job is straightforward. The following needs to be configured for Transcribe to create a job.

  • Name of the job
  • The language of the interviews. I processed English but would love to hear someone’s experience processing other languages.
  • The location of the input file. That object URL you copied earlier.
  • The location where the output should go. I used Amazon's default. (These settings can also be supplied programmatically; see the sketch below.)
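For anyone who prefers boto3 to the console, here is a sketch of the same job; the job name, bucket, file name, and speaker count are placeholders:

import boto3

transcribe = boto3.client('transcribe')
transcribe.start_transcription_job(
    TranscriptionJobName='fam_interviews',
    Media={'MediaFileUri': 's3://my-oral-history-bucket/interviews_part1.flac'},
    MediaFormat='flac',
    LanguageCode='en-US',
    Settings={'ShowSpeakerLabels': True,  # ask Transcribe to label speakers
              'MaxSpeakerLabels': 2})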

That is it! Jobs take a little while to run. Mine took about 20 minutes for an hour of audio. Your mileage may vary. The results can be downloaded as a JSON file. A sample JSON object returned from Transcribe looks like:

{"jobName":"fam_interviews",
"accountId":"123456789",
"results":{
    "transcripts":[{"transcript":"A bunch of text returned by the transcribe service...The end of the text.",
                    "items":[{"start_time":"0.22",
                    "speaker_label":"spk_0", //If you have multiple speakers and asked to have Transcribe identify 
                                               them, this object with a speaker_label and start/end time exists.
                    "end_time":"3.45"}, ...
                  }],
    {start_time:          
         "5245.57",         
         "end_time":"5245.64",         
         "alternatives":[{"confidence":"1.0",          
                          "content":"I"}],
         "type":"pronunciation"} 

Parsing the JSON object is all that is left for doing something useful with the transcript. My Transcribe results were not stellar. I believe that is because the people I interviewed have accents and sometimes used vocabulary from another language. The service did do two things well: it identified the various speakers on the tape correctly, and when I spoke, as the interviewer, it correctly caught most of what I said. So if your audio files feature native English speakers, the service does a fairly decent job. You will likely need to do some significant post processing to transform the transcript into a useful document.
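As a starting point for that post processing, here is a sketch that pulls the full transcript and per-word timings out of the downloaded file. It assumes the output shape shown above; the file name is a placeholder:

import json

with open('fam_interviews.json') as f:
    result = json.load(f)

# The whole transcript comes back as one long string.
print(result['results']['transcripts'][0]['transcript'])

# Each recognized word is an 'item' with timing and confidence.
for item in result['results']['items']:
    if item.get('type') == 'pronunciation':
        word = item['alternatives'][0]['content']
        print(item['start_time'], word)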

Quickly Find Large Files On Windows File System

Once a year I need to free up space on my work machine. And once a year I find myself searching for an efficient way to identify the largest files I no longer need, without installing third party software. Here is the solution I stumbled on this year, and it could not be simpler.

  • Open File Explorer
  • Navigate to the drive you want to search. Normally, C:\
  • Type “size:gigantic” in the search bar
  • Wait for the search to complete
  • Sort the results by size

This search will scan your computer's file system for files that are larger than 128 MB. You can then identify files that are no longer needed.

Ignore all the large files in the C:\Windows\ directory. You want to keep those. Look for files that are in your Documents or Downloads directories first. I always find one or two multi gigabyte files I forgot to delete.
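If you prefer to script it, the same search takes a few lines of Python. A sketch; the root path is a placeholder, and the threshold matches the "gigantic" filter:

import os

root = r'C:\Users'
threshold = 128 * 1024 * 1024  # bytes

for dirpath, _, filenames in os.walk(root):
    for name in filenames:
        path = os.path.join(dirpath, name)
        try:
            size = os.path.getsize(path)
        except OSError:
            continue  # skip files we cannot read
        if size > threshold:
            print(f"{size / 1e9:5.2f} GB  {path}")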

A Simple Progress Bar in Python

Recently, I have been working with the Requests library in Python. I wrote a simple function to pull down a file that took more than a minute to download. While waiting for the download to complete I realized it would be nice to have some insight into the download’s progress. A quick search on StackOverflow led to an excellent example. Below is a simple way to display a progress bar while downloading a file.

import requests
import sys


def download_file(url, name):
    '''
    Takes a url and a filename, creates a request, opens a
    file, and streams the content in chunks to the file system.
    It then writes out an '=' symbol for every two percent of the
    total content length to the console.
    '''
    filename = 'myfile_' + str(name) + '.ext'
    r = requests.get(url, stream=True)
    with open(filename, 'wb') as f:

        total_length = r.headers.get('Content-Length')

        if total_length is None:  # no content length header
            f.write(r.content)
        else:
            downloaded = 0
            total_length = int(total_length)
            for data in r.iter_content(chunk_size=4096):
                downloaded += len(data)
                f.write(data)
                # 50 characters represent 100%, so one '=' per 2%.
                done = int(50 * downloaded / total_length)
                sys.stdout.write("\r[%s%s]" % ('=' * done, ' ' * (50 - done)))
                sys.stdout.flush()

    return 1
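Calling the function looks like this (the URL is a placeholder; any direct file link will do):

download_file('https://example.com/bigfile.zip', 'example')

The bar fills across the console as chunks arrive, and the file lands on disk as myfile_example.ext.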

What’s going on?

requests.get() takes a URL and creates an HTTP request. The stream=True flag is an optional argument that can be submitted to the Request class. It lets the request know that the content should be downloaded in chunks instead of being pulled all at once.

The response headers are then searched for the ‘Content-Length’ attribute. We use the ‘Content-Length’ value to calculate how much has been downloaded and how much remains. The values are stored in variables and updated as the chunks are processed.

The final piece to point out in this little function is the iter_content() method. iter_content():

Iterates over the response data. When stream=True is set on the request, this avoids reading the content at once into memory for large responses. The chunk size is the number of bytes it should read into memory.

This helps handle larger files and gives us a way to track progress. As chunks are processed, variables can be updated. If you do not need or want to roll your own, check out the tqdm library.

Logic for Artificial Intelligence

“Logic has both seductive advantages and bothersome disadvantages.”

Patrick Winston, Artificial Intelligence, p. 283

Logic in artificial intelligence can be used to help an agent create rules of inference. It provides a formal framework for creating if-then statements. Formal logic statements can be difficult for beginners because of the symbols and vocabulary used. Below is a cheat sheet of some of the basic symbols and definitions.

Symbol – Definition
∧ – Logical conjunction. In most instances it will be used as an AND operator.
∨ – Logical disjunction. In most instances it will be used as an OR operator.
∀ – Universal quantifier. Placed in front of a statement that includes ALL entities in the agent's universe.
∃ – Existential quantifier. Placed in front of a statement that applies to at least one entity in the agent's universe.
¬ – Negation. The statement is true only if the condition is false.

Word – Definition
Conjunction – And. A conjunction is true if and only if all of its operands are true. The symbol typically used for this operator is ∧ or &.
Conjunct – An operand of a conjunction.
Disjunction – Or. A disjunction is true if and only if one or more of its operands is true.
Disjunct – An operand of a disjunction.
Predicate – A boolean-valued function or a relationship. A ∧ B = True, or A and B have a specific relationship.
Modus ponens – A rule of inference: given that A is true and A implies B, conclude that B is true.
Monotonic – A property that states a "function is monotonic if, for every combination of inputs, switching one of the inputs from false to true can only cause the output to switch from false to true and not from true to false".

Logic focuses on using knowledge in a provable and correct way. When logic is used in AI, it does not prove that the premises themselves are true. If an agent is taught that all birds can fly, it can use logic to infer that a dog, which cannot fly, is not a bird. However, it will run into problems when classifying a penguin.
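To make the penguin problem concrete, here is a tiny rule-based sketch; the entities and the rule base are hypothetical:

rules = {'bird': 'can_fly'}  # the taught rule: every bird can fly
facts = {'tweety': 'bird', 'rex': 'dog', 'pingu': 'bird'}

def infer(entity):
    '''Modus ponens: from "X is a category" and "category implies property",
    conclude "X has the property".'''
    category = facts[entity]
    derived = {category}
    if category in rules:
        derived.add(rules[category])
    return derived

print(infer('tweety'))  # {'bird', 'can_fly'} -- correct
print(infer('pingu'))   # {'bird', 'can_fly'} -- wrong for a penguin, and a
                        # monotonic system has no way to retract the conclusion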

It is important to keep in mind that logic is a weak representation of certain kinds of knowledge. The difference between water and ice is an example of knowledge that would be difficult to represent using logic. Determining how good a “deal” is would also be better suited to a different knowledge representation. If dealing with a change of state or ranking options, using a different knowledge system would be more appropriate.

Qualia

Have you ever tried to describe the color red to someone with protanopia, deuteranopia, protanomaly, or deuteranomaly? It is nearly impossible, since those who are red-green color blind are missing the corresponding photoreceptors. Yet the experience of seeing red is instantly familiar to anyone who has had it. And that type of experience, one which is difficult to communicate, does not change based on other experiences, is unique to the individual experiencing it, and is immediately recognized, is what we call qualia.

Frank Jackson offered the following definition of qualia:

Photo by Christian Stahl on Unsplash

"[Qualia are] certain features of the bodily sensations especially, but also of certain perceptual experiences, which no amount of purely physical information includes."

A few years later, another philosopher and cognitive scientist, Daniel Dennett, identified four properties ascribed to qualia: "ineffable", "intrinsic", "private", and "directly or immediately apprehensible in consciousness" (Tye 2002, 447). In simpler language, qualia are the properties of how something is experienced. The qualia of "seeing red" may be difficult to describe on their own, but when compared to the qualia of "seeing green" they can be conceptualized and contrasted. That comparison underlies the "inverted spectrum," a famous example of which was presented by John Locke in "Of True and False Ideas":

Portrait of John Locke
By Godfrey Kneller – State Hermitage Museum, St. Petersburg, Russia., Public Domain, https://commons.wikimedia.org/w/index.php?curid=1554640

Neither would it carry any Imputation of Falshood to our simple Ideas, if by the different Structure of our Organs, it were so ordered, That the same Object should produce in several Men’s Minds different Ideas at the same time; v.g. if the Idea, that a Violet produced in one Man’s Mind by his Eyes, were the same that a Marigold produces in another Man’s, and vice versâ. For since this could never be known: because one Man’s Mind could not pass into another Man’s Body, to perceive, what Appearances were produced by those Organs; neither the Ideas hereby, nor the Names, would be at all confounded, or any Falshood be in either. For all Things, that had the Texture of a Violet, producing constantly the Idea, which he called Blue, and those which had the Texture of a Marigold, producing constantly the Idea, which he as constantly called Yellow, whatever those Appearances were in his Mind; he would be able as regularly to distinguish Things for his Use by those Appearances, and understand, and signify those distinctions, marked by the Names Blue and Yellow, as if the Appearances, or Ideas in his Mind, received from those two Flowers, were exactly the same, with the Ideas in other Men’s Minds. 

(Byrne 2016)

These are very difficult concepts to teach an artificially intelligent agent. Concepts with very formal representations, like a triangle or even something like a reptile, are easier for artificially intelligent agents to differentiate. Things like "tastes salty" or "splitting headache" are very difficult to transfer to a learning agent because they are extremely personal. Whether qualia exist at all is actively debated in the philosophical community, especially in arguments around consciousness and the self. I think that is why the exploration of qualia is so interesting for the development of AI.

Girl touching the hand of a robot.
Photo by Andy Kelly on Unsplash

I think people like to explore what it could be like for robots with general intelligence to be sentient. But to develop those qualities, the engineer must examine the meta-cognitive processes that make up the human experience. This task is complicated and one that philosophers still argue about. What does it mean to be human? That question will need to be explored further in order to get closer to general artificial intelligence. In the meantime, I invite you to make a mental note the next time you try to describe qualia to someone else. What analogies did you use? How similar do you think the same experience is for both of you?

References

  • Byrne, Alex, “Inverted Qualia”, The Stanford Encyclopedia of Philosophy (Winter 2016 Edition), Edward N. Zalta (ed.), URL = <https://plato.stanford.edu/archives/win2016/entries/qualia-inverted/>.
  • Tye, M., 2002, “Visual Qualia and Visual Content Revisited”, in Philosophy of Mind, D. Chalmers (ed.), Oxford: Oxford University Press.

A Brief Introduction to Bayes’ Theorem

Bayes’ Theorem stated is, “the conditional probability of A given B is the conditional probability of B given A scaled by the relative probability of A compared to B”. I find it easier to understand through a practical explanation. Let’s say you are having a medical test performed at the recommendation of your doctor, who recommends tests to everyone because they get a nice kickback and college tuition is not cheap! You are young and healthy and are being tested for the existence of a new form of cancer that only exists in 1% of the population. These cancer detecting tests accurately detect the cancer 8 out of 10 times in an infected individual. However, they “detect” cancer in 1 out of 10 cancer free patients. Your test results come back positive! But before you get worried, let’s figure out the chance that you actually have cancer.

This is a job for conditional probability. You want to know the probability that you, a young healthy individual, actually have cancer given the positive test. "The chance of an event is the number of ways it could happen given all possible outcomes"[1]:

Probability = event / all possibilities

or, written as Bayes' Theorem,

P(A|B) = P(B|A) × P(A) / P(B)

When the result is considered in conjunction with the likelihood of other outcomes, it is not that troubling. The table below has the likelihood for each of the outcomes:

                 Cancer (1% of Pop.)               No Cancer (99% of Pop.)
Test Positive    True Positive: 1% × 80% = 0.8%    False Positive: 99% × 10% = 9.9%
Test Negative    False Negative: 1% × 20% = 0.2%   True Negative: 99% × 90% = 89.1%

A true positive has only a 0.8% likelihood! A false positive is much more likely at 9.9%. Put through Bayes' Theorem, the probability you actually have cancer given the positive result is 0.8% / (0.8% + 9.9%) ≈ 7.5%. The likelihood you have cancer, even with a positive test result, is low. You should definitely seek a second opinion.
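If you would rather let Python check the arithmetic, here is the same calculation as a short sketch:

p_cancer = 0.01            # prior: 1% of the population has the cancer
p_pos_given_cancer = 0.80  # the test catches 8 of 10 real cases
p_pos_given_healthy = 0.10 # ...and false-alarms on 1 of 10 healthy patients

# Total probability of testing positive, across both groups.
p_positive = (p_pos_given_cancer * p_cancer +
              p_pos_given_healthy * (1 - p_cancer))

# Bayes' Theorem: P(cancer | positive test).
p_cancer_given_positive = p_pos_given_cancer * p_cancer / p_positive
print(f"{p_cancer_given_positive:.1%}")  # about 7.5%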

It is important to keep in mind that we are calculating the odds of an event given all possibilities. You probably do rough versions of this calculation daily: "Given the dark rain clouds outside and rain in the forecast, I will take an umbrella since I believe it will rain while I am out." If that is not enough to get you excited about Bayes and his contribution to statistics, know that he did it all in an effort to prove the existence of God! If you would like to learn more, check out the links below.

[1]A primer on Bayes theorem which I used as inspiration: https://betterexplained.com/articles/an-intuitive-and-short-explanation-of-bayes-theorem/

The peer reviewed “wiki” entry on Bayesian statistics: http://www.scholarpedia.org/article/Bayesian_statistics

Stanford encyclopedia of Philosophy entry on Bayes Theorem: https://plato.stanford.edu/entries/bayes-theorem/