<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://francescagiannetti.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://francescagiannetti.com/" rel="alternate" type="text/html" /><updated>2026-04-02T16:17:56-04:00</updated><id>https://francescagiannetti.com/feed.xml</id><title type="html">Francesca Giannetti</title><subtitle>portfolio website for Francesca Giannetti</subtitle><author><name>Francesca Giannetti</name></author><entry><title type="html">A thing LLMs do well</title><link href="https://francescagiannetti.com/librarianship/a-thing-llms-do-well/" rel="alternate" type="text/html" title="A thing LLMs do well" /><published>2025-10-10T16:16:00-04:00</published><updated>2025-10-10T16:16:00-04:00</updated><id>https://francescagiannetti.com/librarianship/a-thing-llms-do-well</id><content type="html" xml:base="https://francescagiannetti.com/librarianship/a-thing-llms-do-well/"><![CDATA[<p>I recently got dinged with a number of mandatory changes to digital exhibits I had worked on with academic faculty ahead of an ADA Title II compliance deadline of April 2026 for websites at my institution. Aside from some embeds and links for objects that have seemingly gone missing from the web, most of these changes boiled down to alt text for images. I honestly couldn’t fathom doing this for possibly hundreds of images, and so I decided to figure out how to make Gemini do the work for me. I used Simon Willison’s <code class="language-plaintext highlighter-rouge">llm</code> utility with Gemini, based on <a href="https://simonwillison.net/2024/Oct/29/llm-multi-modal/">his post</a>. After some (Gemini-assisted) tweaks to the bash script, I ran it on a directory of images I took from one of the WordPress exhibits. It worked pretty well, I must say. 
On memes, on digitized archival photos, on really small images, and on a series of screen-capped text message exchanges that one student had bafflingly included. A human somebody will still need to go through these and abbreviate and/or edit them for clarity and accuracy. For example, I caught one error that reminded me of why we shouldn’t ask LLMs to do math. Gemini correctly identified a SpongeBob meme, but it miscounted the number of panels. But still. Not bad. I include the bash script below.</p>

<p>Prerequisites include the installation of Willison’s <code class="language-plaintext highlighter-rouge">llm</code> utility, its <code class="language-plaintext highlighter-rouge">llm-gemini</code> plugin, and a <a href="https://aistudio.google.com/app/api-keys">Gemini API key</a>. Put this script in the same folder with the images and run it in a terminal window.</p>
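
<p>As a rough sketch of that setup, assuming <code class="language-plaintext highlighter-rouge">llm</code> itself is already installed: the first command adds the Gemini plugin, the second prompts for the API key and stores it, and the third runs the script. The file name <code class="language-plaintext highlighter-rouge">alt_text.sh</code> is only a placeholder for whatever you call the script below.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Add Gemini support to the llm utility
llm install llm-gemini

# Store the API key (you will be prompted to paste it)
llm keys set gemini

# Run the script from inside the folder of images (placeholder file name)
bash alt_text.sh
</code></pre></div></div>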

<p>Without question, I’d use this workflow again to make digital exhibits ADA compliant.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash

# Define the model to use
MODEL="gemini-2.5-flash"

# Use the brace expansion to find all image files
for img in *.{jpg,jpeg,png}; do
    
    # Check if the file exists (handles the case where the glob *.{...} finds no matches)
    if [ ! -f "$img" ]; then
        continue # Skip if the glob didn't match a real file
    fi
    
    # 1. Use standard parameter expansion to reliably get the base filename.
    #    This removes the *shortest* suffix matching '.*' (e.g., .jpg, .jpeg)
    base_name="${img%.*}"
    
    # 2. Define the output file with a unique suffix, like '_alt.txt'
    #    This prevents conflicts with any existing .txt files.
    ALT_FILE="${base_name}_alt.txt"

    # Skip images that already have a non-empty alt text file
    # (an empty file means an earlier llm call failed, so retry it)
    if [ -s "$ALT_FILE" ]; then
        echo "Skipping $img: Alt text file already exists at $ALT_FILE"
        continue
    fi
    
    echo "Processing $img, writing to $ALT_FILE..."

    # 3. Call llm with the image attached (-a) and branch on its exit status.
    #    On failure (e.g., quota or timeout), remove the empty output file
    #    so that a later run retries this image.
    if llm -m "$MODEL" \
        'Write a concise, descriptive, and accessibility-focused alt text for this image.' \
        -a "$img" \
        &gt; "$ALT_FILE"; then
        echo "✅ Generated alt text for $img"
    else
        # Capture the llm exit code before any other command overwrites $?
        status=$?
        rm -f "$ALT_FILE"
        echo "❌ Failed to generate alt text for $img (llm exit code $status). Check API status/logs."
    fi
    
done

echo "Alt-text generation loop finished."
</code></pre></div></div>]]></content><author><name>Francesca Giannetti</name></author><category term="librarianship" /><summary type="html"><![CDATA[I recently got dinged with a number of mandatory changes to digital exhibits I had worked on with academic faculty ahead of an ADA Title II compliance deadline of April 2026 for websites at my institution. Aside from some embeds and links for objects that have seemingly gone missing from the web, most of these changes boiled down to alt text for images. I honestly couldn’t fathom doing this for possibly hundreds of images, and so I decided to figure out how to make Gemini do the work for me. I used Simon Willison’s llm utility with Gemini, based on his post. After some (Gemini-assisted) tweaks to the bash script, I ran it on a directory of images I took from one of the WordPress exhibits. It worked pretty well, I must say. On memes, on digitized archival photos, on really small images, and on a series of screen capped text message exchanges that one student had bafflingly included. A human somebody will still need to go through these and abbreviate and/or edit them for clarity and accuracy. For example, I caught one error that reminded me of why we shouldn’t ask LLMs to do math. Gemini correctly identified a Spongebob meme, but it miscounted the number of panels. But still. Not bad. 
I include the bash script below.]]></summary></entry><entry><title type="html">Using DH skills to do collections work</title><link href="https://francescagiannetti.com/librarianship/using-dh-skills-to-do-collections-work/" rel="alternate" type="text/html" title="Using DH skills to do collections work" /><published>2025-04-30T10:30:00-04:00</published><updated>2025-04-30T10:30:00-04:00</updated><id>https://francescagiannetti.com/librarianship/using-dh-skills-to-do-collections-work</id><content type="html" xml:base="https://francescagiannetti.com/librarianship/using-dh-skills-to-do-collections-work/"><![CDATA[<p>I recall once seeing a CFP about applications of digital humanities methods to (traditional) library work. Librarians are a notoriously risk-averse group, and this call seemed oriented towards proving that the scary new thing could actually help our conservative profession do its bread-and-butter work. DH; it’s not so bad! I recall thinking at the time that this was sort of an interesting idea, that I had certainly done it, but nothing that rose to the level of a research article or conference paper. This post falls into the category of ordinary librarian task made easier by DH. Since libraries, and higher ed more generally, are once again facing alarming headwinds, and I’m hearing calls to “get back to basics,” and just not do the more advanced research library stuff for a while, I thought it might be advantageous to show in some detail how DH-y or data librarian-y skills can help with the more traditional subject librarian portfolio. <a href="https://mstdn.social/@mdlincoln">Matthew Lincoln</a> once quipped on the old site that 90% of DH was a table join, and I laughed because it was true, especially in the broader sense of comparing these data to those other data and drawing new insights from the juxtaposition. This post, however, is about a literal table join, lol.</p>

<p>Since summer is around the corner, and I have fewer course and workshop requests, I am turning my attention to collections weeding, which is to say pruning older, disused books to make way for newer materials. In addition to being a DH librarian, I am also a subject liaison to various language and literature fields, which means the responsibility for our congested P class falls more or less squarely on my shoulders. This section of the stacks has been a problem for many years, but since we also loan more P classified volumes than any other Library of Congress class, it never rose to the level of a crisis in my mind. Although lately our space issues have been multiplying, hence the added incentive to give it a look. Our print retention guidelines have this to say about withdrawing volumes:</p>

<blockquote>
  <p>A last copy book may be withdrawn if the book is of very low use and perpetual access to the electronic or digitized version is available in hathitrust.org or if at least five copies are available for borrowing in North American libraries identified in WorldCat. Library- and subject-specific criteria may be used to identify very low use books.</p>
</blockquote>

<p>I figured I’d be able to find large numbers of our P class volumes in HathiTrust if they had fallen into the public domain, which is to say published before 1930 as of this year. I also wanted to make sure that they hadn’t been checked out or used in house in a long while, since just because something is in the public domain doesn’t mean that readers don’t want the print. In my experience, there are readers in every generation who favor print over digital for difficult, cognitively demanding reading. I did an analysis of our EZBorrow (ILL) data once and was stunned that Shakespeare regularly turned up among the most requested P class authors.</p>

<figure class=""><a href="/assets/images/pclass_authors.png" class="image-popup" title="Most requested P class authors. EZBorrow data, requested from Rutgers University.
"><img src="/assets/images/pclass_authors.png" alt="Most requested P class authors in EZBorrow data" /></a><figcaption>
      Most requested P class authors. EZBorrow data, requested from Rutgers University.

    </figcaption></figure>

<p>Specifically, these are the criteria I used to pull volumes:</p>

<ul>
  <li>Alexander Library (this is our main library for the humanities and social sciences)</li>
  <li>Open stacks (holding location for books that circulate; I didn’t want to draw volumes from any holding location that was historically non-circulating since the usage statistics would be artificially depressed)</li>
  <li>Library of Congress call number between P1 and PZ90 (languages and literatures, literary criticism too)</li>
  <li>Loaned and/or used in house fewer than 5 times total</li>
  <li>Last loan date earlier than July 31, 2013 (these last two data points were somewhat arbitrarily chosen to indicate low use)</li>
  <li>Publication date is earlier than 1930 (in other words, in the U.S. public domain)</li>
</ul>

<p>I ended up with a list of 5,147 volumes. Next, I needed to figure out how to compare these volumes to the HathiTrust Digital Library holdings in full view. To do a table join, you will want to find a shared column of data between the two datasets. It’s much less error-prone to do this with a numeric identifier, rather than with a title, say, since a one-character difference makes a new title to the computer. After fumbling around for a bit, I figured out that we mostly had OCLC numbers for our books and so did HathiTrust. Hurray!</p>

<p>HathiTrust regularly publishes updated bibliographic data for their holdings called <a href="https://www.hathitrust.org/member-libraries/resources-for-librarians/data-resources/hathifiles/">hathifiles</a>. This is an incredibly wonderful resource for all kinds of bibliographic analysis. With that said, HTDL currently has over 18 million volumes, and these files are LARGE. They politely say on the site that the files “may be difficult to open with standard spreadsheet software or text editors,” which is something of an understatement. My MacBook Pro was definitely not up to the task, so I started by using the command line program <code class="language-plaintext highlighter-rouge">awk</code> to filter for only those HT volumes in the public domain. I knew from <a href="https://www.hathitrust.org/the-collection/preservation/rights-database/#attributes">HTDL’s rights code attributes list</a> that I just needed the volumes marked “pd” and “pdus” (I didn’t look at the cc licenses because they usually indicate a newer work).</p>

<table>
  <thead>
    <tr>
      <th style="text-align: left">htid</th>
      <th style="text-align: left">access</th>
      <th style="text-align: left">rights</th>
      <th style="text-align: right">ht_bib_key</th>
      <th style="text-align: left">description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left">mdp.39015018415946</td>
      <td style="text-align: left">deny</td>
      <td style="text-align: left">ic</td>
      <td style="text-align: right">1</td>
      <td style="text-align: left">v.5</td>
    </tr>
    <tr>
      <td style="text-align: left">mdp.39015066356547</td>
      <td style="text-align: left">deny</td>
      <td style="text-align: left">ic</td>
      <td style="text-align: right">1</td>
      <td style="text-align: left">v.1</td>
    </tr>
    <tr>
      <td style="text-align: left">mdp.39015066356406</td>
      <td style="text-align: left">deny</td>
      <td style="text-align: left">ic</td>
      <td style="text-align: right">1</td>
      <td style="text-align: left">v.2</td>
    </tr>
    <tr>
      <td style="text-align: left">mdp.39015066356695</td>
      <td style="text-align: left">deny</td>
      <td style="text-align: left">ic</td>
      <td style="text-align: right">1</td>
      <td style="text-align: left">v.3</td>
    </tr>
    <tr>
      <td style="text-align: left">mdp.39015066356554</td>
      <td style="text-align: left">deny</td>
      <td style="text-align: left">ic</td>
      <td style="text-align: right">1</td>
      <td style="text-align: left">v.4</td>
    </tr>
    <tr>
      <td style="text-align: left">uc1.$b759626</td>
      <td style="text-align: left">deny</td>
      <td style="text-align: left">ic</td>
      <td style="text-align: right">1</td>
      <td style="text-align: left">v. 1</td>
    </tr>
    <tr>
      <td style="text-align: left">uc1.$b759627</td>
      <td style="text-align: left">deny</td>
      <td style="text-align: left">ic</td>
      <td style="text-align: right">1</td>
      <td style="text-align: left">v. 2</td>
    </tr>
    <tr>
      <td style="text-align: left">uc1.$b759628</td>
      <td style="text-align: left">deny</td>
      <td style="text-align: left">ic</td>
      <td style="text-align: right">1</td>
      <td style="text-align: left">v. 3</td>
    </tr>
    <tr>
      <td style="text-align: left">mdp.39015033913115</td>
      <td style="text-align: left">deny</td>
      <td style="text-align: left">ic</td>
      <td style="text-align: right">2</td>
      <td style="text-align: left"> </td>
    </tr>
    <tr>
      <td style="text-align: left">mdp.39015061455294</td>
      <td style="text-align: left">deny</td>
      <td style="text-align: left">ic</td>
      <td style="text-align: right">3</td>
      <td style="text-align: left"> </td>
    </tr>
  </tbody>
</table>

<p>In looking at the first few columns of the HTDL metadata, I saw that the values I needed were in the third column (“rights”). The “ic” code stands for in copyright, which is true of a majority of HTDL volumes.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>awk -F "\t" '{ if($3 ~ /pd|pdus/) { print &gt;&gt; "hathi_full_pd.txt" }}' hathi_full_20250401.txt
</code></pre></div></div>

<p>Above is the command I used to tell <code class="language-plaintext highlighter-rouge">awk</code> that the input file is tab delimited (<code class="language-plaintext highlighter-rouge">-F</code> stands for field separator), and that any rows with “pd” or “pdus” values in the third column should get put into a new file called “hathi_full_pd.txt.”<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup></p>
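
<p>Two details of that command are worth spelling out. The <code class="language-plaintext highlighter-rouge">~</code> operator is a regular expression match, so <code class="language-plaintext highlighter-rouge">/pd|pdus/</code> keeps any row whose rights value merely contains the letters “pd,” and <code class="language-plaintext highlighter-rouge">&gt;&gt;</code> appends, so running the command a second time would double the output file. A stricter variant, offered here as a minimal sketch and not as the command used above, tests the third column for exact equality. The function name <code class="language-plaintext highlighter-rouge">filter_public_domain</code> and the “pd-example” code in the sample rows are made up for illustration.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Keep only rows whose third tab-separated column is exactly "pd" or "pdus".
# Reads the named file(s), or standard input when no file is given.
filter_public_domain() {
    awk -F '\t' '$3 == "pd" || $3 == "pdus"' "$@"
}

# Four made-up sample rows; "pd-example" is a hypothetical rights code
printf 'id1\tallow\tpd\nid2\tdeny\tic\nid3\tallow\tpdus\nid4\tallow\tpd-example\n' | filter_public_domain
# prints the id1 and id3 rows only
</code></pre></div></div>

<p>On the real file this would be run as <code class="language-plaintext highlighter-rouge">filter_public_domain hathi_full_20250401.txt &gt; hathi_full_pd.txt</code>, where the single <code class="language-plaintext highlighter-rouge">&gt;</code> overwrites the output file instead of appending to it.</p>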

<p>Next, I ran the following command to see how many rows that left me with.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> wc -l hathi_full_pd.txt
7745163 hathi_full_pd.txt
</code></pre></div></div>

<p>Seven million plus rows is still too many for me to bother trying to open the file on my personal laptop. After experimenting a bit with the sampling methods in R, I decided to segment the whole dataset using a Unix command line utility called <code class="language-plaintext highlighter-rouge">split</code>.<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup> I made a somewhat arbitrary choice to split the file every 250 thousand rows using the following command. This resulted in 31 files named segmentaa, segmentab, segmentac, and so on. Now I could finally begin to manipulate the HTDL data using the R programming language on my laptop.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>split -l 250000 hathi_full_pd.txt segment
</code></pre></div></div>
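
<p>The count of 31 files follows from the row count: 7,745,163 rows divided into chunks of 250,000 gives 30 full files plus one partial file. A small sketch of that arithmetic, using a hypothetical helper named <code class="language-plaintext highlighter-rouge">expected_segments</code>, can serve as a sanity check before loading anything into R.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Number of files that split -l will produce:
# the row count divided by rows per file, rounded up.
expected_segments() {
    rows=$1
    per_file=$2
    echo $(( (rows + per_file - 1) / per_file ))
}

expected_segments 7745163 250000   # prints 31

# After splitting, the segments together should still hold every row:
#   cat segment* | wc -l
</code></pre></div></div>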

<p>Loading the data into my RStudio session looked like this. HathiTrust publishes their hathifiles without a header row, so while I’m reading in all those segment files, I am simultaneously adding a header row using a function. The resulting objects in my environment that I am going to interact with are called <code class="language-plaintext highlighter-rouge">alex_pclass_oclc</code> and <code class="language-plaintext highlighter-rouge">htseg01</code> through <code class="language-plaintext highlighter-rouge">htseg31</code>.</p>

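<div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># load packages: readr for read_csv() and read_delim(); dplyr for rename(), filter(), %&gt;%, and semi_join()</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">readr</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">dplyr</span><span class="p">)</span><span class="w">

</span><span class="c1"># load library data</span><span class="w">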
</span><span class="n">alex_pclass</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">read_csv</span><span class="p">(</span><span class="s2">"alex_alstackc_pclass_fewerthan5circs_pre1930pub_oclc.csv"</span><span class="p">)</span><span class="w">
</span><span class="n">alex_pclass</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">rename</span><span class="p">(</span><span class="n">alex_pclass</span><span class="p">,</span><span class="w"> </span><span class="n">oclc</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">`OCLC Control Number (035a)`</span><span class="p">)</span><span class="w">
</span><span class="c1"># some don't have oclc numbers and will have to be manually checked later</span><span class="w">
</span><span class="n">alex_pclass_oclc</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">alex_pclass</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="n">filter</span><span class="p">(</span><span class="o">!</span><span class="nf">is.na</span><span class="p">(</span><span class="n">oclc</span><span class="p">))</span><span class="w">

</span><span class="c1"># HTDL field names. I made one small change from the HTDL and am calling the column "oclc" instead of "oclc_num" to be able to write a shorter join function</span><span class="w">
</span><span class="n">headers</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="s2">"htid"</span><span class="p">,</span><span class="s2">"access"</span><span class="p">,</span><span class="s2">"rights"</span><span class="p">,</span><span class="s2">"ht_bib_key"</span><span class="p">,</span><span class="s2">"description"</span><span class="p">,</span><span class="s2">"source"</span><span class="p">,</span><span class="s2">"source_bib_num"</span><span class="p">,</span><span class="s2">"oclc"</span><span class="p">,</span><span class="s2">"isbn"</span><span class="p">,</span><span class="s2">"issn"</span><span class="p">,</span><span class="s2">"lccn"</span><span class="p">,</span><span class="s2">"title"</span><span class="p">,</span><span class="s2">"imprint"</span><span class="p">,</span><span class="s2">"rights_reason_code"</span><span class="p">,</span><span class="s2">"rights_timestamp"</span><span class="p">,</span><span class="s2">"us_gov_doc_flag"</span><span class="p">,</span><span class="s2">"rights_date_used"</span><span class="p">,</span><span class="s2">"pub_place"</span><span class="p">,</span><span class="s2">"lang"</span><span class="p">,</span><span class="s2">"bib_fmt"</span><span class="p">,</span><span class="s2">"collection_code"</span><span class="p">,</span><span class="s2">"content_provider_code"</span><span class="p">,</span><span class="s2">"responsible_entity_code"</span><span class="p">,</span><span class="s2">"digitization_agent_code"</span><span class="p">,</span><span class="s2">"access_profile_code"</span><span class="p">,</span><span class="s2">"author"</span><span class="p">)</span><span class="w">

</span><span class="c1"># function to read file with some arguments and add headers</span><span class="w">
</span><span class="n">ht_read</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">file_path</span><span class="p">,</span><span class="w"> </span><span class="n">column_names</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="n">data</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">read_delim</span><span class="p">(</span><span class="n">file_path</span><span class="p">,</span><span class="w"> </span><span class="n">col_names</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">,</span><span class="w"> </span><span class="n">delim</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'\t'</span><span class="p">,</span><span class="w"> </span><span class="n">col_types</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cols</span><span class="p">(</span><span class="n">.default</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"c"</span><span class="p">))</span><span class="w">
  </span><span class="n">colnames</span><span class="p">(</span><span class="n">data</span><span class="p">)</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">column_names</span><span class="w">
  </span><span class="nf">return</span><span class="p">(</span><span class="n">data</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="c1"># read all the HT segment datasets in as a list of dataframes and explode into individual dataframes</span><span class="w">
</span><span class="n">files</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">Sys.glob</span><span class="p">(</span><span class="s2">"segment*"</span><span class="p">)</span><span class="w">
</span><span class="n">file_list</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">lapply</span><span class="p">(</span><span class="n">files</span><span class="p">,</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">file</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="n">ht_read</span><span class="p">(</span><span class="n">file</span><span class="p">,</span><span class="w"> </span><span class="n">headers</span><span class="p">)</span><span class="w">
</span><span class="p">})</span><span class="w">
</span><span class="nf">names</span><span class="p">(</span><span class="n">file_list</span><span class="p">)</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">sprintf</span><span class="p">(</span><span class="s2">"htseg%02d"</span><span class="p">,</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">file_list</span><span class="p">))</span><span class="w">
</span><span class="n">list2env</span><span class="p">(</span><span class="n">file_list</span><span class="p">,</span><span class="w"> </span><span class="n">envir</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">.GlobalEnv</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>

<p>While attempting to convert the OCLC numbers to a numeric data type, I discovered that the HTDL sometimes stores more than one OCLC number per cell. This will interfere with my table join, so I need to split up those values into separate columns. For the time being, I decided not to bother checking anything other than the first OCLC number, but I may circle back and look at the others later. I wrote this function to split up the values, keep the first value in a column called <code class="language-plaintext highlighter-rouge">oclc</code>, store the second value in <code class="language-plaintext highlighter-rouge">oclc2</code>, and dump anything after that. Then, the for loop applies the function to each segment dataframe in my environment.</p>

<div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># separate oclc values, if more than one, and re-type as numeric</span><span class="w">
</span><span class="n">oclc_split</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">data</span><span class="p">,</span><span class="w"> </span><span class="n">column_name</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="n">split_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">strsplit</span><span class="p">(</span><span class="nf">as.character</span><span class="p">(</span><span class="n">data</span><span class="p">[[</span><span class="n">column_name</span><span class="p">]]),</span><span class="w"> </span><span class="s2">","</span><span class="p">)</span><span class="w">
  </span><span class="n">data</span><span class="o">$</span><span class="n">oclc</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">sapply</span><span class="p">(</span><span class="n">split_values</span><span class="p">,</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="n">ifelse</span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="o">&gt;=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="p">[</span><span class="m">1</span><span class="p">],</span><span class="w"> </span><span class="kc">NA</span><span class="p">))</span><span class="w">
  </span><span class="n">data</span><span class="o">$</span><span class="n">oclc</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="nf">as.numeric</span><span class="p">(</span><span class="n">data</span><span class="o">$</span><span class="n">oclc</span><span class="p">)</span><span class="w">
  </span><span class="n">data</span><span class="o">$</span><span class="n">oclc2</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">sapply</span><span class="p">(</span><span class="n">split_values</span><span class="p">,</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="n">ifelse</span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="o">&gt;=</span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="p">[</span><span class="m">2</span><span class="p">],</span><span class="w"> </span><span class="kc">NA</span><span class="p">))</span><span class="w">
  </span><span class="nf">return</span><span class="p">(</span><span class="n">data</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="c1"># split oclc column in each df</span><span class="w">
</span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="m">31</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="n">df_name</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">sprintf</span><span class="p">(</span><span class="s2">"htseg%02d"</span><span class="p">,</span><span class="w"> </span><span class="n">i</span><span class="p">)</span><span class="w">
  </span><span class="n">df</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">get</span><span class="p">(</span><span class="n">df_name</span><span class="p">)</span><span class="w">
  </span><span class="n">df</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">oclc_split</span><span class="p">(</span><span class="n">df</span><span class="p">,</span><span class="w"> </span><span class="s2">"oclc"</span><span class="p">)</span><span class="w">
  </span><span class="n">assign</span><span class="p">(</span><span class="n">df_name</span><span class="p">,</span><span class="w"> </span><span class="n">df</span><span class="p">,</span><span class="w"> </span><span class="n">envir</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">.GlobalEnv</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
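
<p>To make the splitting rule concrete, here is the same logic applied to a single cell at the command line. This is a shell and <code class="language-plaintext highlighter-rouge">awk</code> sketch of the rule, not the R function itself, and the name <code class="language-plaintext highlighter-rouge">split_oclc</code> and the sample numbers are invented for illustration: the first value is kept, the second is kept if present, and anything after that is dropped.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Split one comma-separated OCLC cell into "first" and "second" values,
# printing NA for a missing value and ignoring any values after the second.
split_oclc() {
    printf '%s\n' "$1" | awk -F ',' '{
        first = ($1 == "") ? "NA" : $1
        second = ($2 == "") ? "NA" : $2
        print first "\t" second
    }'
}

split_oclc '12345,67890,111'   # prints 12345 and 67890, separated by a tab
split_oclc '12345'             # prints 12345 and NA, separated by a tab
</code></pre></div></div>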

<p>Finally, I arrived at the point where I could join the HathiTrust data to our library P class data. I’m using the <code class="language-plaintext highlighter-rouge">semi_join()</code> function from the dplyr package to keep any and all rows in my library’s data that match an OCLC number (just the first one, remember) in the HathiTrust data. There’s another for loop to go through all those HTDL segments one by one and overwrite each segment object with its matches. The last function rolls all the matches together into one dataframe that I can then write to file. There were 3,632 matches out of 5,147, or about 71 percent, which is pretty good!</p>

<div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># join with alex p class data</span><span class="w">
</span><span class="n">oclc_compare</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">dataset1</span><span class="p">,</span><span class="w"> </span><span class="n">dataset2</span><span class="p">,</span><span class="w"> </span><span class="n">shared_column</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="n">matches</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">semi_join</span><span class="p">(</span><span class="n">dataset1</span><span class="p">,</span><span class="w"> </span><span class="n">dataset2</span><span class="p">,</span><span class="w"> </span><span class="n">by</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">shared_column</span><span class="p">)</span><span class="w">
  </span><span class="nf">return</span><span class="p">(</span><span class="n">matches</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="c1"># apply oclc_compare to each df</span><span class="w">
</span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="m">31</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="n">df_name</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">sprintf</span><span class="p">(</span><span class="s2">"htseg%02d"</span><span class="p">,</span><span class="w"> </span><span class="n">i</span><span class="p">)</span><span class="w">
  </span><span class="n">df</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">get</span><span class="p">(</span><span class="n">df_name</span><span class="p">)</span><span class="w">
  </span><span class="n">df</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">oclc_compare</span><span class="p">(</span><span class="n">alex_pclass_oclc</span><span class="p">,</span><span class="w"> </span><span class="n">df</span><span class="p">,</span><span class="w"> </span><span class="s2">"oclc"</span><span class="p">)</span><span class="w">
  </span><span class="n">assign</span><span class="p">(</span><span class="n">df_name</span><span class="p">,</span><span class="w"> </span><span class="n">df</span><span class="p">,</span><span class="w"> </span><span class="n">envir</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">.GlobalEnv</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="c1"># roll the results into one dataframe</span><span class="w">
</span><span class="n">data_list</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="nf">list</span><span class="p">()</span><span class="w">
</span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="m">31</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="n">df_name</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">sprintf</span><span class="p">(</span><span class="s2">"htseg%02d"</span><span class="p">,</span><span class="w"> </span><span class="n">i</span><span class="p">)</span><span class="w">
  </span><span class="n">df</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">get</span><span class="p">(</span><span class="n">df_name</span><span class="p">)</span><span class="w">
  </span><span class="n">data_list</span><span class="p">[[</span><span class="n">i</span><span class="p">]]</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">df</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="n">combined_data</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">bind_rows</span><span class="p">(</span><span class="n">data_list</span><span class="p">)</span><span class="w">

</span><span class="n">write_csv</span><span class="p">(</span><span class="n">combined_data</span><span class="p">,</span><span class="w"> </span><span class="s2">"alex_pclass_in_htdl_fullview.csv"</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>
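
<p>To make the behavior of <code class="language-plaintext highlighter-rouge">semi_join()</code> concrete, here is a minimal sketch on toy data. The tibbles, OCLC numbers, and HathiTrust identifiers are invented for illustration and are not drawn from the actual files. It shows the property the join above relies on: rows from the first table are kept if they have at least one match in the second, and nothing from the second table is added.</p>

<div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code>library(dplyr)

# toy stand-ins for the library data and one HTDL segment (invented values)
local_books &lt;- tibble(oclc  = c("100", "200", "300"),
                      title = c("Book A", "Book B", "Book C"))
ht_segment  &lt;- tibble(oclc = c("200", "200", "400"),
                      htid = c("demo.001", "demo.002", "demo.003"))

# keep rows of local_books that have at least one match in ht_segment;
# no columns from ht_segment are added and no rows are duplicated
toy_matches &lt;- semi_join(local_books, ht_segment, by = "oclc")
</code></pre></div></div>

<p>Only the row for OCLC number 200 comes back, and it comes back once even though it matches two rows in the toy segment. That is why the row count after the join can be read as a count of matched local records.</p>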

<p>It’s likely that we will withdraw most of these books, freeing up some desperately needed space on our second and third floors. Scanning the titles gave me a few moments of doubt (Dinah Craik! Anatole France. Joaquin Miller. Really?), but in pretty much every case these were works that are widely held, often elsewhere in our selfsame libraries, either in the same or in later editions. Book historians and scholarly editors know that different editions of a work are not “copies” and not interchangeable, but the reality for us is that we probably no longer need ten versions of Theodore Dreiser’s <em>An American Tragedy</em>, given the pressure on our spaces. There was also plenty of, dare I say it, B and C list literature, for which HathiTrust digital access is certainly good enough. I paused when I saw <a href="https://catalog.hathitrust.org/Record/012453512"><em>Some queer Americans and other stories</em></a>… Surely not “queer” like that, right? Indeed, it is a peculiar tome exoticizing and patronizing the residents of rural Appalachia. There were also plenty of vaguely prurient sounding titles like <a href="https://catalog.hathitrust.org/Record/102426649"><em>The College Widow</em></a> and <a href="https://catalog.hathitrust.org/Record/001030910"><em>The story of a bad boy</em></a>, which were funny to look at, but again, nothing to agonize over in terms of keeping the print.</p>

<p>Incidentally, I am rusty at writing functions in R, and after fumbling through some Google search results, I caved and opened my institution’s Copilot instance. While the results weren’t always just right on the first or second try, I have to admit that Copilot saved me a ton of time sifting through Stack Overflow replies.</p>

<p>Find the whole script as a GitHub Gist at <a href="https://gist.github.com/giannetti/752ff7760f633f7cbfd194a5a1212948">https://gist.github.com/giannetti/752ff7760f633f7cbfd194a5a1212948</a>.</p>

<hr />

<h2 id="addendum-added-2025-05-15">Addendum added 2025-05-15</h2>

<p>It turns out we are conservative about weeding too (ha). I was asked to show if we have any print duplication of these volumes in our library system. And so I once again relied on the fact that we mostly have OCLC numbers to do another join. I queried the analytics tool of our catalog for all P class volumes NOT in Alexander Library and came up with a list.</p>

<div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># draw everything in p class that is not in Alex, dropping low use filters b/c not important</span><span class="w">
</span><span class="n">notalex_pclass</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">read_csv</span><span class="p">(</span><span class="s2">"NOTalex_pclass_pre1930.csv"</span><span class="p">)</span><span class="w">
</span><span class="n">notalex_pclass</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">rename</span><span class="p">(</span><span class="n">notalex_pclass</span><span class="p">,</span><span class="w"> </span><span class="n">oclc</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">`OCLC Control Number (035a)`</span><span class="p">)</span><span class="w">
</span><span class="n">notalex_pclass</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">notalex_pclass</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="n">filter</span><span class="p">(</span><span class="o">!</span><span class="nf">is.na</span><span class="p">(</span><span class="n">oclc</span><span class="p">))</span><span class="w">
</span></code></pre></div></div>

<p>I did a slightly modified join with the HathiTrust volumes to show my receipts, once again looping through all the segment dataframes. A note that I used the <code class="language-plaintext highlighter-rouge">multiple</code> argument of “any,” which will capture <strong>any</strong> match in the HTDL dataset. It doesn’t really matter to me how many matches there are, just so long as there is one. I also programmatically created an HTDL URI based on the pattern they follow, so that any user can hopefully just click a link to see the digital full view.</p>

<div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># trying this differently to retain fields from HTDL</span><span class="w">
</span><span class="n">oclc_match</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">dataset1</span><span class="p">,</span><span class="w"> </span><span class="n">dataset2</span><span class="p">,</span><span class="w"> </span><span class="n">shared_column</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="n">matches</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">inner_join</span><span class="p">(</span><span class="n">dataset1</span><span class="p">,</span><span class="w"> </span><span class="n">dataset2</span><span class="p">,</span><span class="w"> </span><span class="n">by</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">shared_column</span><span class="p">,</span><span class="w"> </span><span class="n">multiple</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"any"</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
    </span><span class="n">mutate</span><span class="p">(</span><span class="n">uri</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">paste0</span><span class="p">(</span><span class="s2">"https://hdl.handle.net/2027/"</span><span class="p">,</span><span class="w"> </span><span class="n">htid</span><span class="p">))</span><span class="w">
  </span><span class="nf">return</span><span class="p">(</span><span class="n">matches</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
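
<p>As a hedged illustration of what <code class="language-plaintext highlighter-rouge">multiple = "any"</code> does, the sketch below runs the same kind of join on invented data. The table names and the <code class="language-plaintext highlighter-rouge">demo.*</code> identifiers are hypothetical, and the <code class="language-plaintext highlighter-rouge">multiple</code> argument assumes dplyr 1.1.0 or later.</p>

<div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code>library(dplyr)

# invented data: one local record that matches two HathiTrust volumes
local_books &lt;- tibble(oclc = c("100", "200"))
ht_segment  &lt;- tibble(oclc = c("200", "200"),
                      htid = c("demo.001", "demo.002"))

# multiple = "any" returns a single matching row per row of local_books,
# without promising which of the matches it will be
one_match &lt;- inner_join(local_books, ht_segment, by = "oclc", multiple = "any") %&gt;%
  mutate(uri = paste0("https://hdl.handle.net/2027/", htid))
</code></pre></div></div>

<p>The result has one row for OCLC number 200, carrying whichever <code class="language-plaintext highlighter-rouge">htid</code> the join happened to pick plus the handle-style link built from it. The unmatched record (100) drops out, as it would with any inner join.</p>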

<p>Next, I did another inner join with the “not Alex” dataset to find any match in our libraries. The resulting dataframe, <code class="language-plaintext highlighter-rouge">print_overlap</code>, now has redundant metadata fields, which I pruned a bit to show only the essentials. This join showed that we have 1,204 P class volumes in Alex that are duplicated elsewhere in our libraries and available in full view in HathiTrust. Hey, this should still help recover space!</p>

<div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">print_overlap</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">combined_data</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">inner_join</span><span class="p">(</span><span class="n">notalex_pclass</span><span class="p">,</span><span class="w"> </span><span class="n">by</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"oclc"</span><span class="p">,</span><span class="w"> </span><span class="n">multiple</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"any"</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>
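
<p>The pruning step is not shown above. One way it might look, assuming <code class="language-plaintext highlighter-rouge">select()</code> and entirely hypothetical column names, is sketched here on toy data. When both tables share a non-key column, dplyr suffixes the copies with <code class="language-plaintext highlighter-rouge">.x</code> and <code class="language-plaintext highlighter-rouge">.y</code>, which is where the redundancy comes from.</p>

<div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code>library(dplyr)

# invented stand-ins; the real exports have different columns
ht_matches &lt;- tibble(oclc = "200", title = "Book B", htid = "demo.001")
other_libs &lt;- tibble(oclc    = c("200", "200"),
                     title   = c("Book B", "Book B"),
                     library = c("Branch 1", "Branch 2"))

overlap &lt;- ht_matches %&gt;%
  inner_join(other_libs, by = "oclc", multiple = "any") %&gt;%
  # the shared title column comes back as title.x and title.y; keep one copy
  select(oclc, title = title.x, htid, library)
</code></pre></div></div>

<p>Renaming inside <code class="language-plaintext highlighter-rouge">select()</code> keeps the pruning and the cleanup of suffixed names in a single step.</p>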

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>See <a href="https://www.tim-dennis.com/data/tech/2016/08/09/using-awk-filter-rows.html">“Using AWK to Filter Rows”</a> by Tim Dennis for a very handy tutorial. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>For more split examples, see <a href="https://servicenow.iu.edu/kb?id=kb_article_view&amp;sysparm_article=KB0026049">https://servicenow.iu.edu/kb?id=kb_article_view&amp;sysparm_article=KB0026049</a>. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Francesca Giannetti</name></author><category term="librarianship" /><summary type="html"><![CDATA[I recall once seeing a CFP about applications of digital humanities methods to (traditional) library work. Librarians are a notoriously risk-averse group, and this call seemed oriented towards proving that the scary new thing could actually help our conservative profession do its bread-and-butter work. DH; it’s not so bad! I recall thinking at the time that this was sort of an interesting idea, that I had certainly done it, but nothing that rose to the level of a research article or conference paper. This post falls into the category of ordinary librarian task made easier by DH. Since libraries, and higher ed more generally, are once again facing alarming headwinds, and I’m hearing calls to “get back to basics,” and just not do the more advanced research library stuff for a while, I thought it might be advantageous to show in some detail how DH-y or data librarian-y skills can help with the more traditional subject librarian portfolio. Matthew Lincoln once quipped on the old site that 90% of DH was a table join, and I laughed because it was true, especially in the broader sense of comparing these data to those other data and drawing new insights from the juxtaposition. 
This post, however, is about a literal table join, lol.]]></summary></entry><entry><title type="html">Post: Building the Historical Maps of New Jersey website</title><link href="https://francescagiannetti.com/blog/building-the-historical-maps-of-new-jersey/" rel="alternate" type="text/html" title="Post: Building the Historical Maps of New Jersey website" /><published>2025-04-03T00:00:00-04:00</published><updated>2025-04-03T00:00:00-04:00</updated><id>https://francescagiannetti.com/blog/building-the-historical-maps-of-new-jersey</id><content type="html" xml:base="https://francescagiannetti.com/blog/building-the-historical-maps-of-new-jersey/"><![CDATA[<p>An interview of Mike Siegel of the department of Geography, with Sue Oldenburg.</p>]]></content><author><name>Francesca Giannetti</name></author><category term="Blog" /><category term="GIS" /><summary type="html"><![CDATA[An interview of Mike Siegel of the department of Geography, with Sue Oldenburg.]]></summary></entry><entry><title type="html">Notes from the Front: Resisting Maximalism in Your Minimal Edition</title><link href="https://francescagiannetti.com/tei2024/" rel="alternate" type="text/html" title="Notes from the Front: Resisting Maximalism in Your Minimal Edition" /><published>2024-10-08T10:30:00-04:00</published><updated>2024-10-08T10:30:00-04:00</updated><id>https://francescagiannetti.com/tei2024</id><content type="html" xml:base="https://francescagiannetti.com/tei2024/"><![CDATA[<h2 id="abstract">Abstract</h2>

<p>Minimal editions have become popular because they answer the problem of curating texts under a set of constraints. However, encoding and publication workflows, even for minimal editions, run the risk of becoming bloated if the editor loses sight of the project’s priorities. Over a project’s life cycle, it is normal that these priorities will shift under the pressure of this or that constraint or challenge. Several authors have noted that minimal computing stacks displace complexity away from users and onto the editor or technical partner (Dombrowski 2022; Giannetti 2019; Hughes 2016). Identifying the <em>necessary</em> technical complexity is rarely easy. I focus on the latter two of Risam and Gil’s four-question heuristic for minimal computing—“what must we prioritize?”; and “what are we willing to give up?”—in order to demonstrate the importance of reengaging with these questions as new challenges come to light (Risam and Gil 2022).</p>

<p>With the Personal Correspondence from the Rutgers College War Service Bureau, I initially made choices consonant with a minimal approach. The edition navigation replicated the file structure of the archival collection, organized by the name of the alumnus serving in World War I. The project schema exhibited straightforward choices regarding the encoding of people, places, and events. However, the plan to capture biographical information on each Rutgers person mentioned nearly ran aground when a single soldier mentioned fifty classmates in his letters. A desire for a more enriching reader experience led to a choice to provide subject access to the letters, which in turn brought classification and UI difficulties. In this case study, I provide substance to the hard decisions editors face amidst evolving priorities and the need to curtail some plans. Documentation of best practices in this area will serve other editors who must make pragmatic choices in the service of local knowledge production.</p>

<h2 id="short-paper">Short Paper</h2>

<p><em>Note: You may <a href="https://slides.francescagiannetti.com/tei2024.html">find my slides here</a>, although I have replicated most visual supports below.</em></p>

<p>What is a minimal edition? It might mean more than one thing at a time. For some, it will mean a digital edition built using a minimal computing approach, by which I mean editorial projects that focus on necessities and reduce technological and resource dependencies. For others, a minimal edition will denote an editorial approach, with the goal of creating something like a reading edition of a text, which is to say a singular text, possibly accompanied by explanatory notes (Vanhoutte 2009). In yet a third case, a minimal edition could mean both things at the same time. This is the definition I am using to describe my approach to an online pedagogical edition: the <a href="https://rutgersdh.github.io/warservicebureau/">Personal Correspondence from the Rutgers College War Service Bureau</a>. Using a four-question heuristic proposed by Alex Gil and Roopika Risam as a launching point, I will take us on a project management tour through wants, needs, uncertainties, challenges, and some victories (Risam and Gil 2022). A minimal approach to a text can easily balloon to a maximal one when wants get in the way of outcomes. Regularly engaging with the questions—“what must we prioritize?” and “what are we willing to give up?”—has been key to pushing through various impasses. In this talk, I will describe a few of these challenges, and compare my choices to those of editors of another minimal edition. I will conclude with a few observations that I hope will assist editors facing similar challenges.</p>

<p>The records of the Rutgers College War Service Bureau is a collection of university records from the period of US engagement in World War I. The records include correspondence between the director and Rutgers men serving in the war. My involvement in the project began when digitization was complete; my role was to create opportunities to engage student researchers with the digitized materials. Given my background in text encoding, I naturally thought of a digital edition.</p>

<p>I will quickly dispatch the first two questions Gil and Risam pose. What did I need? An online edition of some sort so that students could have something to point to when they completed their work with me. What did I have? I was already aware of the Jekyll static site generator and the Ed theme for minimal editions, and it seemed natural to use them after making a handful of modifications to suit the features of correspondence. My troubles began with the latter two questions, which in my experience really speak to the decision-making aspects of the project plan.</p>

<p>As project director, I was highly motivated to learn several things myself in addition to teaching students. And I lacked the typical project management constraints that would have forced me to make binding decisions sooner. First, I have no real “client” unless it is the student encoders themselves, which switches the focus from outputs to processes. And second, this project has no funding (funders typically impose strict timelines). When I published the first letter anthologies to the web, I made a decision to replicate the structure of the archival collection itself, which is organized into files of correspondence associated with individual alumni. This structuring choice also made sense from a pedagogical perspective. It is easy to assign an alumnus to one or more students so that they can spend time with a single person’s handwriting and learn more about them through archival and genealogical research. However, this choice eventually posed a challenge for online navigation, since access by alumnus name was meaningless to most readers. At an NEH institute on “Advanced Digital Editing,” I was encouraged to create wireframes for various possible interfaces. I had the institute’s chosen platform, the very much not minimal eXist-db, in mind when I drew these images, which show access by location, subject, and date. Location information struck me as important to keep, since many of the locations mentioned in the letters are training camps throughout the US and Canada or sites along or near the Western Front, and would not be especially familiar to most readers. Consequently, part of my learning would be to extract location information from the <code class="language-plaintext highlighter-rouge">correspDesc</code> and <code class="language-plaintext highlighter-rouge">listPlace</code> elements and visualize it in an online map, preferably using open source software. 
I was also very attached to the idea of subject access to individual letters, which would mean breaking up my letter anthologies into individual letters classed by subject. I had no good idea of how to accomplish this without a database, which was not an option for this project.</p>

<figure class="third full">
  
    
      <a href="/assets/images/wireframe_1.png" title="Location access wireframe">
          <img src="/assets/images/wireframe_1.png" alt="Location access wireframe" />
      </a>
    
  
    
      <a href="/assets/images/wireframe_2.png" title="Subject access wireframe">
          <img src="/assets/images/wireframe_2.png" alt="Subject access wireframe" />
      </a>
    
  
    
      <a href="/assets/images/wireframe_3.png" title="Date access wireframe">
          <img src="/assets/images/wireframe_3.png" alt="Date access wireframe" />
      </a>
    
  
  
    <figcaption>Wireframes of interface showing access by location, subject, and date
</figcaption>
  
</figure>

<p>One of the felicitous aspects of my participation in that NEH institute was a realization that I had to cede some control of the edition to the student interns working with me. I did not myself have all the answers, and it was quite possible that they would have better, more actionable ideas. Madiha Maajid, an intern who was a computer science major, listened to my thoughts and hopes for subject access and my exhortation to avoid databases, and created a prototype that kept the primary access by alumnus but added secondary access by subject. This solution did not require a radical reorganization of the existing edition. She used only JavaScript, HTML, and CSS.</p>

<figure class=""><a href="/assets/images/madiha_subject-tags.png" class="image-popup" title="Prototype with subject tags, by M. Maajid.
"><img src="/assets/images/madiha_subject-tags.png" alt="prototype with subject tags" /></a><figcaption>
      Prototype with subject tags, by M. Maajid.

    </figcaption></figure>

<p>Maajid also created a more image-friendly mock-up of the interface, which I have yet to implement.</p>

<figure class=""><a href="/assets/images/madiha_alumni-view.png" class="image-popup" title="Image-based interface mockup, by M. Maajid.
"><img src="/assets/images/madiha_alumni-view.png" alt="image-based interface mockup" /></a><figcaption>
      Image-based interface mockup, by M. Maajid.

    </figcaption></figure>

<p>Following her lead, I gave up a location-based interface and instead added a Leaflet map only to those letter anthologies where it might enhance a reader’s understanding of the letters.</p>

<figure class=""><img src="/assets/images/ainsworth-map.png" alt="Locations of correspondents and places mentioned in correspondence associated with Captain Ainsworth" /><figcaption>
      Locations of correspondents and places mentioned in correspondence associated with Captain Ainsworth.

    </figcaption></figure>

<p>Some alumni moved about the United States, France, and elsewhere, while others stayed in the same location for their whole war; the latter men didn’t need maps.</p>

<p>The scope of the navigation issue, and the required technologies, seemed to be narrowing, which was progress. That left the challenge of adding subject classification to the encodings and the interface. In a first pass, the students and I use the Hypothesis annotation tool to make a preliminary assignment of subjects with links to Library of Congress authorities.</p>

<figure class=""><a href="/assets/images/hypothesis-annotations.png" class="image-popup" title="First pass of subject classification using Hypothesis annotation tool.
"><img src="/assets/images/hypothesis-annotations.png" alt="First pass of subject classification using Hypothesis annotation tool" /></a><figcaption>
      First pass of subject classification using Hypothesis annotation tool.

    </figcaption></figure>

<p>Of the many potential themes to highlight, we seek the ones that are descriptive of individuals as well as ones with the potential to carry over to other anthologies, but no more than 20 subjects per individual letter anthology. We use Hypothesis to highlight the segment of the letter that is about that subject. In a second step, we add the subjects to the encodings themselves via a <code class="language-plaintext highlighter-rouge">keywords</code> element in the <code class="language-plaintext highlighter-rouge">teiHeader</code> and within the letter divs. We are using something of a shortcut and adding the subjects to the <code class="language-plaintext highlighter-rouge">@ana</code> attribute inside of the opening tag of the letter <code class="language-plaintext highlighter-rouge">div</code>. Compare this approach to that of <a href="https://scholarlyediting.org/issues/41/kinship-and-longing/">Kinship &amp; Longing: Keywords for Black Louisiana</a>.</p>

<figure class=""><a href="/assets/images/wsb-subjects.png" class="image-popup" title="War Service Bureau subjects.
"><img src="/assets/images/wsb-subjects.png" alt="War Service Bureau subjects" /></a><figcaption>
      War Service Bureau subjects.

    </figcaption></figure>

<figure class=""><a href="/assets/images/kinship-longing.png" class="image-popup" title="Kinship and Longing keywords.
"><img src="/assets/images/kinship-longing.png" alt="Kinship and Longing keywords" /></a><figcaption>
      Kinship and Longing keywords.

    </figcaption></figure>

<p>In terms of textual evidence, the Kinship &amp; Longing approach is preferable because the editors keep their <code class="language-plaintext highlighter-rouge">//seg/@ana</code> encoding attached to the sentences that inspired them. Their commitment to a minimal edition is manifested in their choice of only four keywords across the documents. I couldn’t commit to so few subject terms, but I did cut a corner on the placement of the <code class="language-plaintext highlighter-rouge">@ana</code> attributes, which saves considerable time. On the front end, with the addition of some JavaScript, the subjects become clickable buttons that open and close the letter anthologies like an accordion, alleviating an issue of overly long web pages.</p>
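
<p>For readers who want to see the div-level shortcut in miniature, the following is a rough sketch and not the project’s actual encoding or code. It assumes the xml2 package, and the TEI fragment, the <code class="language-plaintext highlighter-rouge">type</code> and <code class="language-plaintext highlighter-rouge">n</code> attributes, and the subject pointers are all invented. It reads the <code class="language-plaintext highlighter-rouge">@ana</code> values off each letter <code class="language-plaintext highlighter-rouge">div</code> and splits them into one list of subjects per letter.</p>

<div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code>library(xml2)

# invented TEI fragment: subjects sit on the letter div, not on segments
tei &lt;- read_xml('
&lt;TEI xmlns="http://www.tei-c.org/ns/1.0"&gt;
  &lt;text&gt;&lt;body&gt;
    &lt;div type="letter" n="1" ana="#training #homesickness"&gt;&lt;p&gt;placeholder&lt;/p&gt;&lt;/div&gt;
    &lt;div type="letter" n="2" ana="#censorship"&gt;&lt;p&gt;placeholder&lt;/p&gt;&lt;/div&gt;
  &lt;/body&gt;&lt;/text&gt;
&lt;/TEI&gt;')

# TEI elements live in a namespace, so the XPath needs a prefix for it
ns &lt;- c(tei = "http://www.tei-c.org/ns/1.0")
letter_divs &lt;- xml_find_all(tei, "//tei:div[@type='letter']", ns)

# @ana holds whitespace-separated pointers; split them per letter
subjects &lt;- strsplit(xml_attr(letter_divs, "ana"), "\\s+")
names(subjects) &lt;- xml_attr(letter_divs, "n")
</code></pre></div></div>

<p>Because the subjects hang on the whole letter, this lookup can say which letters are about a subject but not which sentences. That is the tradeoff against the <code class="language-plaintext highlighter-rouge">//seg/@ana</code> approach.</p>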

<h2 id="concluding-thoughts">Concluding Thoughts</h2>

<p>In comparing my idealized, eXist-db inspired wireframes to where we ended, it becomes apparent that there is tension between minimalism (editorial or computational) and a flexible user experience. Note that in the long list of minimals from the <a href="http://go-dh.github.io/mincomp/">GO::DH Minimal Computing Working Group</a> no one mentioned “minimal friction,” because there is often some user friction with minimal interfaces. Access points were eliminated because we could not support them within our constraints. The trick is perhaps finding an elegant balance between friction and user friendliness.</p>

<p>Minimalism as an approach felt at times like a straitjacket; it meant giving up on some of my personal learning objectives. But I also find the four-question heuristic to be seriously useful. Just because letter metadata can be visualized in all kinds of ways does not make that approach meaningful for every text. And as I eventually realized, the war theme and the focus on trauma required a respectful sobriety and simplicity, <em>not</em> tons of interactivity.</p>

<p>Finishing, at least provisionally, is important. I spent a lot of time in a deficit mindset, focusing on wants and avoiding hard choices. Being able to point to some kind of output is important for institutional validation, certainly. But it also matters for building confidence as an editor. And there is some catharsis in documenting the tradeoffs, as I have done with you today.</p>

<h2 id="works-cited">Works cited</h2>

<p>Dombrowski, Quinn. 2022. “Minimizing Computing Maximizes Labor.” <em>Digital Humanities Quarterly</em> 16 (2). <a href="http://www.digitalhumanities.org/dhq/vol/16/2/000594/000594.html">http://www.digitalhumanities.org/dhq/vol/16/2/000594/000594.html</a>.</p>

<p>Giannetti, Francesca. 2019. “‘So near While Apart’: Correspondence Editions as Critical Library Pedagogy and Digital Humanities Methodology.” <em>The Journal of Academic Librarianship</em> 45 (5): 1–11. <a href="https://doi.org/10.1016/j.acalib.2019.05.001">https://doi.org/10.1016/j.acalib.2019.05.001</a>.</p>

<p>Hughes, Joel. 2016. “Minimal Definitions - Notes.” Minimal Computing (blog). October 7, 2016. <a href="http://go-dh.github.io/mincomp/thoughts/2016/10/07/minimal-definitions-notes/">http://go-dh.github.io/mincomp/thoughts/2016/10/07/minimal-definitions-notes/</a>.</p>

<p>Risam, Roopika, and Alex Gil. 2022. “Introduction: The Questions of Minimal Computing.” <em>Digital Humanities Quarterly</em> 16 (2). <a href="https://www.digitalhumanities.org/dhq/vol/16/2/000646/000646.html">https://www.digitalhumanities.org/dhq/vol/16/2/000646/000646.html</a>.</p>

<p>Vanhoutte, Edward. 2009. “Every Reader His Own Bibliographer – An Absurdity?” In <em>Text Editing, Print and the Digital World</em>, edited by Kathryn Sutherland and Marilyn Deegan. Farnham, England: Ashgate.</p>]]></content><author><name>Francesca Giannetti</name></author><category term="talks" /><category term="pedagogy" /><summary type="html"><![CDATA[This is the text of a talk I gave at the Text Encoding Initiative Conference in Buenos Aires (TEI2024).]]></summary></entry><entry><title type="html">Site Updates</title><link href="https://francescagiannetti.com/jekyll/update/site-updates/" rel="alternate" type="text/html" title="Site Updates" /><published>2024-06-13T18:04:00-04:00</published><updated>2024-06-13T18:04:00-04:00</updated><id>https://francescagiannetti.com/jekyll/update/site-updates</id><content type="html" xml:base="https://francescagiannetti.com/jekyll/update/site-updates/"><![CDATA[<p>I knew there was a reason I deferred website updates until the summer.</p>

<p>My hosting provider started to charge extra for running Ruby applications a while back. This effectively knocked out my Jekyll site generation workflow since I didn’t feel like paying. It was fine, I told myself, since I had been meaning to move to GitHub Pages for version control anyway. Previously, each time I updated my site, I effectively clobbered what was there before, which is not the greatest in terms of digital preservation. But! But. Getting everything working on GitHub Pages involved shaving a mountain full of yaks, reminding me that Ruby is kind of the worst. Search is the last thing to fix on this site, so hopefully I’ll get that working soon (or just admit failure and mask it altogether, lol). <em>Edited 2024-06-15 to add search is working again. :D</em></p>

<p>Among the improvements: an updated headshot, a ✨Projects✨ section (like a proper DHer), updated socials, and links to a few blog posts I published elsewhere.</p>

<p>I won’t go into all the wrestling with Ruby and Jekyll, except to say that GitHub Pages caps you at Jekyll 3.9.5 and only works with select versions of Ruby (not the latest stable release!).<sup id="fnref:fn1" role="doc-noteref"><a href="#fn:fn1" class="footnote" rel="footnote">1</a></sup> <a href="https://jekyllrb.com/docs/continuous-integration/github-actions/">GitHub Actions</a> are supposed to be a workaround for using whichever versions you like, but it just didn’t work in my experience. Also, Ruby Version Manager (rvm) is such a headache, and I found rbenv to be easier (thanks to the Code4Lib Slack for setting me on the right path).</p>

<p>Probably the biggest unintended consequence of mapping a GitHub repo to my root domain was that all the little side projects and slide decks that were formed as baseurl/project were giving 404s now because everything was pointing to GitHub’s servers. Ha ha ha ha ha ha HA HA.. oh.. hmmm. This was also fine, I told myself, because I had been meaning to collect all my slide decks in one place, and you may find them now at <a href="https://slides.francescagiannetti.com/">https://slides.francescagiannetti.com/</a>. As for my never-ending technical experiments and bits and bobs, those have now been moved to <a href="https://sandbox.francescagiannetti.com/">https://sandbox.francescagiannetti.com/</a>. I have only made public those messes I more regularly need to find (<em>grins</em>).</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:fn1" role="doc-endnote">
      <p>See <a href="https://pages.github.com/versions/">https://pages.github.com/versions/</a>. <a href="#fnref:fn1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Francesca Giannetti</name></author><category term="jekyll" /><category term="update" /><summary type="html"><![CDATA[I knew there was a reason I deferred website updates until the summer.]]></summary></entry><entry><title type="html">Post: Open Humanities Panel</title><link href="https://francescagiannetti.com/blog/open-humanities/" rel="alternate" type="text/html" title="Post: Open Humanities Panel" /><published>2024-04-08T00:00:00-04:00</published><updated>2024-04-08T00:00:00-04:00</updated><id>https://francescagiannetti.com/blog/open-humanities</id><content type="html" xml:base="https://francescagiannetti.com/blog/open-humanities/"><![CDATA[<p>A panel event I co-organized on open humanities scholarship.</p>]]></content><author><name>Francesca Giannetti</name></author><category term="Blog" /><category term="scholarly communication" /><summary type="html"><![CDATA[A panel event I co-organized on open humanities scholarship.]]></summary></entry><entry><title type="html">Post: Musings on Web Development</title><link href="https://francescagiannetti.com/blog/web-development/" rel="alternate" type="text/html" title="Post: Musings on Web Development" /><published>2023-05-22T00:00:00-04:00</published><updated>2023-05-22T00:00:00-04:00</updated><id>https://francescagiannetti.com/blog/web-development</id><content type="html" xml:base="https://francescagiannetti.com/blog/web-development/"><![CDATA[<p>Something I wrote for the Rutgers DHI website about navigating the terrain of online publishing, with accompanying grumbles about digital project maintenance.</p>]]></content><author><name>Francesca Giannetti</name></author><category term="Blog" /><category term="maintenance" /><summary type="html"><![CDATA[Something I wrote for the Rutgers DHI website about navigating the terrain of online publishing, with accompanying grumbles about digital project maintenance.]]></summary></entry><entry><title type="html">Using VS Code for TEI 
Editing</title><link href="https://francescagiannetti.com/vscode-for-tei/" rel="alternate" type="text/html" title="Using VS Code for TEI Editing" /><published>2023-02-24T22:00:00-05:00</published><updated>2023-02-24T22:00:00-05:00</updated><id>https://francescagiannetti.com/using-vs-code-for-tei-editing</id><content type="html" xml:base="https://francescagiannetti.com/vscode-for-tei/"><![CDATA[<h2 id="overview">Overview</h2>

<p>In scrounging around the internet for a post on setting up Visual Studio Code for text encoding, I realized to my annoyance that there wasn’t one. That’s not really all that surprising, given the average shelf life of these tools. It wasn’t all that long ago that I bookmarked Andrew Dunning’s excellent <a href="https://andrewdunning.ca/getting-started-editing-tei-xml-atom">“Getting started with editing TEI XML using Atom”</a> (RIP Atom 😢). Why bother going through this again for the current text editor du jour? Well, if you are like me, you work with students who are often rather computer savvy but who still need clear instructions. So here I am blogging my own project documentation for myself and possibly for you.</p>

<p>My context for writing this post is a pedagogical editorial project called the <a href="https://rutgersdh.github.io/warservicebureau/">Correspondence of the Rutgers College War Service Bureau</a>. I periodically work with student interns who are interested in working on projects at the intersection of archives, history, and technology. This World War I project is all that plus valuable institutional history. The letters by and about these Rutgers men give us a glimpse of early twentieth century college life (Rutgers wouldn’t become a university until 1924) and touch upon themes common to WWI histories such as anxiety, death, disability, and masculinity. It usually doesn’t take long before the student realizes that they are approximately the same age as the men whose letters they are editing, sometimes prompting revelatory and uncomfortable comparisons. In my own textual editing, I tend to use the <a href="https://www.oxygenxml.com/">Oxygen XML Editor</a>, which is proprietary, though an academic license is not beyond reach. Once one becomes a bit more familiar with XPath and other XML family languages, its many features start to become addictive. However, students usually don’t need all that functionality, hence the desire for a fully open source work environment.</p>

<p>VS Code is already favored by students with a programming bent at my institution. Asking them to install a few VS Code extensions, and maybe Java, and—what the hell—Saxon HE too, is at least free if not easy. My instructions below are for the Mac operating system. If any courageous individual wants to write up instructions for Windows, please do so and then tell me about it! I know that I will need to point to Windows instructions soon.</p>

<h2 id="installations">Installations</h2>

<p>Go to <a href="https://code.visualstudio.com/download">code.visualstudio.com/download</a> and select the installer for your operating system. Open the zipped file. This should automatically launch the installation process. Double click on the new Visual Studio Code shortcut appearing on your desktop to ensure it opens as expected. Macs will give you a warning about it being an “app downloaded from the Internet” and are you sure? Yes. Click Open.</p>

<p>Alternatively, if you are already a <a href="https://brew.sh/">Homebrew</a> user, you can install VS Code in Terminal with the following command:</p>

<p><code class="language-plaintext highlighter-rouge">brew install --cask visual-studio-code</code></p>

<p>And then type <code class="language-plaintext highlighter-rouge">open -a "Visual Studio Code"</code>.</p>

<p>When you open VS Code, you will see a series of icons in the left menu. From top to bottom, they are Explorer, Search, Source Control, Run and Debug, and Extensions. We will focus on Extensions for now.</p>

<figure class=""><img src="/assets/images/vsc1.png" alt="visual studio code editor" /><figcaption>
      The left side navigation of Visual Studio Code

    </figcaption></figure>

<p>Click on Extensions and use the search bar to locate <a href="https://marketplace.visualstudio.com/items?itemName=raffazizzi.sxml">Scholarly XML</a>. Select it and press the install button. We will use Scholarly XML to check that our XML is well formed and valid. What does that mean? In a nutshell, we want our XML file to conform to the XML syntax rules and to use valid Text Encoding Initiative element and attribute names.</p>

<figure class=""><img src="/assets/images/vsc2.png" alt="scholarly xml extension" /><figcaption>
      Installing the Scholarly XML extension in Visual Studio Code

    </figcaption></figure>

<p>Verify that Scholarly XML is working as expected by opening a new text file in VS Code. Copy the XML snippet from the extension page under the <strong>Validation</strong> subheading and paste it at the top of your new text file. VS Code should autodetect that you are working on an XML document. If you get a red squiggle under the pasted snippet and a warning that you are missing the <code class="language-plaintext highlighter-rouge">teiHeader</code> tag, the validator is working!</p>

<figure class=""><img src="/assets/images/vsc3.png" alt="XML validation" /><figcaption>
      Working validation in VS Code: Huzzah!

    </figcaption></figure>

<p>We could stop here if we are content with the ability to ensure our XML is well-formed and valid against the TEI schema. However, error checking is much easier when you can transform your TEI XML into HTML for the web. With that in mind, we are going to install two more extensions in VS Code. Follow the same installation procedure as before.</p>

<ul>
  <li><a href="https://marketplace.visualstudio.com/items?itemName=george-alisson.html-preview-vscode">HTML Preview</a></li>
  <li><a href="https://marketplace.visualstudio.com/items?itemName=WashirePie.vscode-xsl-transform">XSL Transform</a></li>
</ul>

<p>The XSL Transform extension has two external dependencies that we will need to install separately. Those are <a href="https://www.java.com/en/download/">Java</a> and <a href="https://www.saxonica.com/download/download_page.xml">Saxon Home Edition or Saxon-HE</a>.</p>

<p>Download the Java installer for your operating system. Double click on the .dmg file to launch the installation process. See the <a href="https://www.java.com/en/download/help/mac_install.html">Java website</a> for more detailed instructions.</p>

<p>Verify that you have Java installed on your machine. In a Terminal window, type</p>

<p><code class="language-plaintext highlighter-rouge">/usr/libexec/java_home</code></p>

<p>If you have more than one Java installation, try also</p>

<p><code class="language-plaintext highlighter-rouge">/usr/libexec/java_home -V</code></p>

<p>I have three installations. The one I use for XSL transformations is located at</p>

<p><code class="language-plaintext highlighter-rouge">/Library/Java/JavaVirtualMachines/jdk1.8.0_77.jdk/Contents/Home</code></p>

<p>Install Saxon-HE by downloading it from here: <a href="https://github.com/Saxonica/Saxon-HE">https://github.com/Saxonica/Saxon-HE</a>. As of this writing, the latest stable version is 11.5. Back on your Mac, use Finder to go to the folder <code class="language-plaintext highlighter-rouge">/Library/Java</code>. Within this directory, if you do not already have a folder called <code class="language-plaintext highlighter-rouge">Extensions</code> alongside the one called <code class="language-plaintext highlighter-rouge">JavaVirtualMachines</code>, create it now. Drag and drop the Saxon jar into <code class="language-plaintext highlighter-rouge">Extensions</code>. For example, my Saxon is located at the following path: <code class="language-plaintext highlighter-rouge">/Library/Java/Extensions/saxon-he-10.6.jar</code> (yes, I’m a few versions behind).</p>
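<p>If you prefer Terminal to Finder, the same step can be sketched as a small shell function. This is only an illustration: <code class="language-plaintext highlighter-rouge">install_saxon_jar</code> is a name I made up, and it assumes you have already unzipped the download and know where the jar is.</p>

```shell
# Hypothetical helper: copy a downloaded Saxon jar into an Extensions
# folder, creating the folder first if it is missing.
install_saxon_jar() {
  jar="$1"        # path to the unzipped jar (placeholder)
  dest_dir="$2"   # target folder, e.g. /Library/Java/Extensions
  if [ ! -f "$jar" ]; then
    echo "No jar found at $jar"
    return 1
  fi
  mkdir -p "$dest_dir"
  cp "$jar" "$dest_dir/"
  # Print the final location so it can be pasted into settings later.
  echo "$dest_dir/$(basename "$jar")"
}

# Example call (the Downloads path is an assumption):
# install_saxon_jar "$HOME/Downloads/saxon-he-10.6.jar" /Library/Java/Extensions
```

<p>Writing to <code class="language-plaintext highlighter-rouge">/Library</code> usually requires administrator rights, so on a real Mac you may need to run the <code class="language-plaintext highlighter-rouge">mkdir</code> and <code class="language-plaintext highlighter-rouge">cp</code> steps with <code class="language-plaintext highlighter-rouge">sudo</code>.</p>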

<h2 id="settings-configurations">Settings Configurations</h2>

<p>Back in VS Code under Extensions, click on the settings wheel for XSL Transform and select Extension Settings. Under <code class="language-plaintext highlighter-rouge">Xsl:processor</code> enter the path to your Saxon (again, mine is <code class="language-plaintext highlighter-rouge">/Library/Java/Extensions/saxon-he-10.6.jar</code>). <a href="https://github.com/rutgersdh/wsb-data/blob/master/processing/wsb_html_error-checking.xsl">Download this XSLT stylesheet</a> for the War Service Bureau project. Then, copy the path to where this stylesheet is located on your computer and enter it into the XSL Transform settings under <code class="language-plaintext highlighter-rouge">Xsl:stylesheet</code>. Mine is at <code class="language-plaintext highlighter-rouge">/Users/Francesca/projects/wsb-data/processing/wsb_html_error-checking.xsl</code>.</p>

<figure class=""><img src="/assets/images/vsc4.png" alt="XSL Transform extension" /><figcaption>
      Configuring the XSL Transform extension in Visual Studio Code

    </figcaption></figure>
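<p>A mistyped path is the most likely reason for the transformation to fail later, so it can help to confirm that both paths point to real files before entering them in the settings. Here is a minimal sketch; <code class="language-plaintext highlighter-rouge">check_paths</code> is a hypothetical helper, not part of the extension.</p>

```shell
# Hypothetical helper: report whether each argument is an existing file.
# Returns 0 only if every file exists.
check_paths() {
  missing=0
  for f in "$@"; do
    if [ -f "$f" ]; then
      echo "ok: $f"
    else
      echo "missing: $f"
      missing=1
    fi
  done
  return "$missing"
}

# Example call with the two paths mentioned above (yours will differ):
# check_paths /Library/Java/Extensions/saxon-he-10.6.jar \
#   /Users/Francesca/projects/wsb-data/processing/wsb_html_error-checking.xsl
```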

<p>Open a sample letter anthology for the War Service Bureau in VS Code (e.g. <a href="https://github.com/rutgersdh/wsb-data/blob/master/letters/ainsworthwpe.xml">Ainsworth</a> or <a href="https://github.com/rutgersdh/wsb-data/blob/master/letters/brachereg.xml">Bracher</a>). With the letter anthology open, press Shift + Command + P to get the command palette. In the search bar, start to enter “transform” until you see the xsl.transform command. Press enter. Alternatively, use the XSL Transform keyboard shortcut (control + option + T on a Mac). An unnamed HTML document should open in VS Code. Save it as test.html or use the file prefix of the file you transformed (for example: voorheesjb.html). Next, with your cursor in the HTML document, return to the command palette to run HTML: Open Preview (this is part of the HTML Preview extension). It should generate an HTML preview inside of VS Code. Alternatively, you can simply open the HTML file in the browser of your choice to inspect. Voilà!</p>
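<p>The same transformation can also be run from Terminal, which is handy if the extension misbehaves. This is a rough sketch and not how the extension itself calls Saxon: it assumes Saxon-HE 10.x at the path shown, that <code class="language-plaintext highlighter-rouge">java</code> is on your PATH, and that the stylesheet sits where I keep mine (relative to your own home directory). The function names are my own invention. <code class="language-plaintext highlighter-rouge">-s:</code>, <code class="language-plaintext highlighter-rouge">-xsl:</code>, and <code class="language-plaintext highlighter-rouge">-o:</code> are Saxon’s command-line options for source, stylesheet, and output.</p>

```shell
# Placeholder paths; adjust both to your own machine.
SAXON_JAR="/Library/Java/Extensions/saxon-he-10.6.jar"
STYLESHEET="$HOME/projects/wsb-data/processing/wsb_html_error-checking.xsl"

# Derive the HTML file name from the XML file name
# (letters/foo.xml becomes letters/foo.html).
html_name_for() {
  echo "${1%.xml}.html"
}

# Transform one letter anthology with Saxon and print the output path.
transform_letter() {
  src="$1"
  out="$(html_name_for "$src")"
  if java -jar "$SAXON_JAR" -s:"$src" -xsl:"$STYLESHEET" -o:"$out"; then
    echo "$out"
  else
    echo "Transformation failed for $src"
    return 1
  fi
}

# Example call:
# transform_letter letters/ainsworthwpe.xml
```

<p>Newer Saxon releases may need additional jars from the download kept alongside the main jar, so treat the <code class="language-plaintext highlighter-rouge">java -jar</code> line as a starting point.</p>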

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/jlZDadVx4YY" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<p>The screen capture above shows what the XSLT transformation looks like in VS Code.</p>

<p>Reading your encoding as HTML is much easier on the eyes and is good error checking practice!</p>]]></content><author><name>Francesca Giannetti</name></author><category term="pedagogy" /><category term="text encoding" /><summary type="html"><![CDATA[Instructions for setting up Visual Studio Code as an XML editor for use with the Text Encoding Initiative.]]></summary></entry><entry><title type="html">Post: Rutgers Joins the BTAA Geoportal</title><link href="https://francescagiannetti.com/blog/rutgers-joins-btaa-geoportal/" rel="alternate" type="text/html" title="Post: Rutgers Joins the BTAA Geoportal" /><published>2023-02-15T00:00:00-05:00</published><updated>2023-02-15T00:00:00-05:00</updated><id>https://francescagiannetti.com/blog/rutgers-joins-btaa-geoportal</id><content type="html" xml:base="https://francescagiannetti.com/blog/rutgers-joins-btaa-geoportal/"><![CDATA[<p>An overview of the records contributed by Rutgers, including a personal favorite from Special Collections.</p>]]></content><author><name>Francesca Giannetti</name></author><category term="Blog" /><category term="GIS" /><summary type="html"><![CDATA[An overview of the records contributed by Rutgers, including a personal favorite from Special Collections.]]></summary></entry><entry><title type="html">A Privilege</title><link href="https://francescagiannetti.com/personal-statement/" rel="alternate" type="text/html" title="A Privilege" /><published>2021-02-16T11:00:00-05:00</published><updated>2021-02-16T11:00:00-05:00</updated><id>https://francescagiannetti.com/a-privilege</id><content type="html" xml:base="https://francescagiannetti.com/personal-statement/"><![CDATA[<p>I let my promotion with tenure (Librarian II at my institution) pass without comment last year because it was a dreadful year in which some of my colleagues lost jobs and I felt more than ever like I had won some kind of a lottery just by happening to be in the right place at the right time. 
With that said, though, I know a lot of work goes into evaluating tenure cases, and I am grateful to my Rutgers colleagues and to my external referees for their careful attention and support. I also felt the need to share some of what went into my dossier, because I know how hard it is to articulate one’s value in a newer academic library role. I am the first digital humanities librarian at my institution and I am also a subject librarian. I was acutely aware that DH looks a bit different at every institution and I was perhaps overly sensitive about the gaps in my own portfolio vis-à-vis whatever shared understanding of digital humanities could be said to exist. It was a profound help to me to be able to read other people’s professional statements as I was preparing my own, including those of <a href="https://doi.org/10.6084/m9.figshare.3413698.v2">Heather Coates</a> and <a href="https://ryancordell.org/statements">Ryan Cordell</a>. Huge props also to <a href="http://kalanicraig.com/dossier/">Kalani Craig</a> for publishing her statement after she submitted (and before hearing the result!). So with this post, I would like to share my own personal statement, in case others can benefit from seeing how I made sense of the various threads of my professional responsibilities.</p>

<p>It may help to preface this with the gap that concerned me the most. Project development or management is often among the more highly prized skills of digital humanities librarians.<sup id="fnref:fn1" role="doc-noteref"><a href="#fn:fn1" class="footnote" rel="footnote">1</a></sup> Due to the way my job is configured, and the way digital infrastructure works at my institution, I could not get too heavily involved in the management of collaborative digital research projects. I can consult, and I can train, but I often cannot be the one to manage the day-to-day and week-to-week labor of these projects, as this kind of work is extremely time-intensive, and, like more and more of us, I juggle many roles at my institution. And while I may have some misgivings about this fact, I am also grateful for the chance to consult on several smaller digital projects, and to develop a few of my own, which have become opportunities for the mentorship of student scholars interested in digital methods. Although it has taken me a while to settle into this truth, there is no one way of doing the job of a digital humanities librarian. We rely on our own professional interests and strengths, and the needs and interests of those with whom we regularly work, to shape our paths.</p>

<p><em>Personal statement, submitted August 9, 2019</em></p>

<h2 id="librarianship">Librarianship</h2>

<p>My greatest aim as a hybrid librarian active in digital humanities and subject liaison work is to bring both sides of my librarianship work into constructive, mutually beneficial dialogue, with the goal of reinvigorating public services librarianship through a critical understanding of twenty-first century challenges like authority, information overload, and intellectual property. As the first person to occupy the role of digital humanities librarian at Rutgers–New Brunswick, my work has contributed to many firsts for the Libraries. My interpretation of digital humanities librarianship emphasizes 1) fostering opportunities to learn diverse digital methods that will improve research and teaching, and 2) creating a community of practice in which experts on campus can connect and collaborate with each other. Defining digital humanities as the use of digital tools and methods to study the humanities, I have used my musical background as a disciplinary lens through which to communicate a range of research activities, such as the capture, creation, enrichment, analysis, and interpretation of data. At the same time, I possess the language expertise required to be a subject liaison in Comparative Literature, French, and Italian, and I have found many fruitful points of contact between my subject and functional library roles.</p>

<p>My activity as a digital humanities librarian focuses to a great extent on pedagogy. I am one of few librarians or technologists at Rutgers who collaborate with disciplinary faculty to deliver lectures and trainings on digital research methods in term courses at the graduate and undergraduate levels; I have created original material for courses in the fields of literature, history, musicology, and Latino and Caribbean studies. Humanities faculty interested in exploring digital humanities in partnership with a librarian constitute a new audience for the Libraries. I plan, organize, and teach stand-alone workshops in the libraries on the sources and methods of digital humanities in order to create and grow collaborations. These workshops have attracted broad participation from over 30 academic departments and programs in the humanities, the social sciences, and the sciences. Via these embedded and stand-alone workshops and lectures, I have reached nearly 1,500 students and faculty in the past five years. I teach a Byrne seminar called “Data Mining in the Humanities” that introduces digital humanities to first-year undergraduates; the <a href="http://dx.doi.org/10.17613/M6MW7B">syllabus</a>, deposited in the Big Ten DH group of Humanities Commons, has been downloaded over 200 times. I continue to develop my personal digital skill set, which includes humanistic applications of XML technologies, statistical programming, and Geographic Information Systems, through regular attendance of workshops, courses, and intensive summer institutes. As a result of my work, I was invited by faculty of the English department to help develop a new 300-level, team-taught course called “Data and Culture,” scheduled to be taught in 2020-21, that will provide a broad theoretical and practical overview of various topics in digital culture. 
I was also invited by senior administrators in Rutgers Global and the School of Arts and Sciences (SAS) Honors Program to develop a two-week seminar in Spain on the topic of text encoding and manuscript studies for the summer of 2020.</p>

<p>Digital humanities community building and research infrastructure inform my work on several library task forces and working groups. I helped launch, and currently oversee, the Digital Humanities Lab in Alexander Library, a cross-disciplinary research space supported by the Libraries and the School of Arts and Sciences, where students, faculty and staff meet to work on projects and learn about digital methods. I have served as chair and as a member of the system-wide Libraries Digital Humanities Working Group. My work with this group involved the development of a Libraries-hosted digital publishing service that allows scholars to use WordPress and Omeka for course projects and the sharing of informal research; this service has been used by over 200 faculty and students across New Brunswick, Newark, and Camden. As a steering member of the Libraries’ Graduate Specialist program, in which graduate students supplement library-based consulting and training in advanced digital research methods, I supervise the work of a digital humanities graduate specialist who provides specialized assistance in text analysis methodologies using the R programming language. Together with graduate student conveners, I hosted a digital humanities reading group, running in Fall 2017, in which a core group of participants from across the humanities and social sciences explored the topic from a variety of perspectives including race and gender theory, pedagogy, and digital project development. I work with the Geography department, and the Rutgers Undergraduate Geography Society, among others, to host an annual mapathon in which participants contribute geospatial data to the Humanitarian OpenStreetMap platform to aid disaster relief efforts around the globe.
I have sought to further the libraries’ important role in advanced research through my service as leader of the Research Spaces team, which studied specialized research spaces in academic library settings, and as a current member of the Research Data Services, Digital Projects, and Copyright teams.</p>

<p>As part of my liaison duties, I have met with faculty and graduate students in Classics, Comparative Literature, French, and Italian to discuss scholarly communication topics, such as open access, self-archiving, digital publishing, and author rights in their individual disciplines. We have discussed Scholarly Open Access at Rutgers (SOAR), the interface for depositing scholarly articles, and the relaunch or conversion of several departmental journals to a new digital platform. I have helped contacts in Art History, Comparative Literature, and Italian to transition their graduate student journals to the user-friendly WordPress platform, thereby facilitating the onboarding process for new student editors, extending the longevity of the publications, and providing a venue in which to practice digital publishing skills (see <em>Digital Projects: Project Consultant and Trainer</em> section of CV). I have advised graduate students in my liaison areas and in other departments on the creation of digital projects related to their doctoral research, and on secondary topics of interest, and I have helped these students to acquire new skills, develop their research agendas, and in a few cases start careers in digital humanities. Influenced in some part by my work with their graduate students, the Italian department has sought to expand their involvement in digital humanities by offering DH mini-seminars, taught by visiting professors, and by hiring a DH post-doctoral fellow.</p>

<p>I manage the collections for the departments of Classics, French, and Italian, and the program in Comparative Literature, in consultation with the faculty and graduate students of those disciplines, and I have applied data analysis skills acquired through my research activity to analyze library patron preferences in support of collection management work. For example, by analyzing logs of requests for interlibrary loan books, I discovered a high number of requests for books about digital humanities; as a result, I successfully argued for the creation of a special fund code for the acquisition of monographs in this area. Through similar analyses of requests for literature and literary criticism, I found a high volume of requests for public domain authors like Shakespeare, Virgil, and Euripides, whose texts are freely available online, indicating most likely that Rutgers users are requesting specific editions and translations, or that they prefer the print medium for more immersive, cognitively demanding reading. This has, in turn, influenced my selective uptake of e-books in the humanities. I have developed and created subject-specific research guides in all of my liaison areas. I also help students conducting multilingual research to navigate the Libraries’ collections, external web resources, and citation styles and tools. In spite of a difficult environment for collection development at Rutgers, I have been able to make strategic additions to our film, monograph, and serials collections, providing enhancements in the areas of Francophone Caribbean and African literature and culture, French Early Modern literature, migration studies, Italian women writers, history of science, and ecocriticism and environmental humanities.</p>

<p>I participate weekly or biweekly in all forms of library public services, including chat, email, and desk reference at Alexander Library. In addition to delivering information literacy instruction in my liaison departments, I also teach library sessions for the Honors College, the SAS Honors Program, and the Rutgers Writing Program.</p>

<p>As a digital humanities librarian and subject specialist, I provide research expertise across multiple domains. I look forward to developing new collaborations in support of cross-disciplinary computational research, including one with the Rutgers Office of Advanced Research Computing (OARC), and to my continued work on the above-mentioned pedagogy projects with the School of Arts and Sciences.</p>

<h2 id="scholarship">Scholarship</h2>

<p>In my scholarship, I pursue topics at the intersections of information studies, digital humanities, and music in support of the activities of emerging interdisciplinary research communities. Applications of computing in music and sound studies are relatively rare, and often challenging because of the non-textual nature of most music information. My work aims to improve the dissemination of high-quality digital research outputs in music, broaden awareness of digital methods in the humanities more generally, and develop frameworks for the evaluation of such resources, with the goal of facilitating new, cross-disciplinary modes of inquiry. Across these topics, I seek to bring newer and established media into critical dialogue with each other by discussing how, for example, social media data can inform the digital preservation programs of archival collections, and demonstrating how the study of manuscripts and print culture can be enhanced by data modeling and the creation of online scholarly editions. In keeping with the interdisciplinary nature of my research, I have been cited by scholars in the fields of music business, music technology, science and technology studies, computer science, and literary studies. The open access versions of my work in RUcore, the Rutgers institutional repository, have been downloaded 200 times by researchers in 20 countries. My research intervention is twofold: advocacy for researcher training that includes material and theoretical engagements with technology, and documentation of the evolving nature of library work in connection with digital humanities research and teaching.</p>

<p>I take a particular interest in non-textual formats that are underutilized (as measured in citations) when compared to other library and archival sources, and have devoted a portion of my research activity to exploring the impact of digital sound recordings among researchers and the general public. I published an empirical study (<a href="http://dx.doi.org/10.1002/asi.23990">A Twitter Case Study for Assessing Digital Sound</a>) on the interactions of Twitter users with digital archival sound in the <em>Journal of the Association for Information Science and Technology</em>, a first quartile journal in information science (<strong>source</strong>: Scimago), and among the top three publication venues for information school faculty. My study characterized a range of user interactions that pointed to impact that was not captured in the citation record. Since communicating the benefits of digitization programs is a <em>sine qua non</em> of successful funding at cultural heritage institutions, I sought with this study to model an approach to the capture and analysis of social media data that could be adapted by researchers with similar objectives.</p>

<p>I am committed to the inclusion of digital methods in the education and professional development of students, faculty, and librarians. A thread of my research deals with the subject of digital humanities pedagogy and the introduction of digital methods and tools. My article, <a href="http://dx.doi.org/10.1080/10691316.2017.1340217">“Against the Grain: Reading for the Challenges of Collaborative DH Pedagogy,”</a> published in <em>College and Undergraduate Libraries</em>, describes the common challenges of collaborative digital humanities pedagogy with the purpose of building a foundation of shared knowledge upon which DH practitioners can build. In <a href="https://acrl.ala.org/dh/2017/08/04/what-im-reading-this-summer-rebecca-dowson/">a post</a> published in the <em>dh+lib Review</em>, Rebecca Dowson of Simon Fraser University wrote of this article, “as Giannetti points out, critical reflections on the challenges of this work have not yet been a focus of scholarship. This gap in the literature is a shame… I hope others will take up Giannetti’s call to share our failures with each other…” This article was recently republished in a monograph entitled <em>The Digital Humanities: Implications for Librarians, Libraries, and Librarianship</em>. My article on text markup, data modeling, and pedagogy (<a href="https://doi.org/10.1016/j.acalib.2019.05.001">“‘So near while apart’: Correspondence Editions as Critical Library Pedagogy and Digital Humanities Methodology,”</a>) was recently published in the <em>Journal of Academic Librarianship</em>. In this article, I present two case studies on the pedagogical applications of the Text Encoding Initiative (TEI), and the value of librarian involvement in the process and products of TEI editorial work. 
As a pedagogical exercise, text encoding is an especially compelling way of introducing students to topics such as editorial theory and digital remediation, and librarians have both the unique knowledge of their collections and the technical skills to make such pedagogical interventions successful introductions to the use of primary sources in historical and cultural research.</p>

<p>With colleagues at institutions in the U.S. and the U.K., I am developing a digital research environment called Music Scholarship Online (MuSO), a contributing node of the Advanced Research Consortium (ARC) at Texas A&amp;M University whose aims are to improve discovery of digital scholarly outputs in music as well as develop a peer review framework for the evaluation of digital work in musicology. Our case study entitled <a href="http://www.digitalhumanities.org/dhq/vol/13/1/000381/000381.html">“Music Scholarship Online (MuSO): A Research Environment for a More Democratic Digital Musicology”</a> discusses the problems of a closed canon, the underutilized potential of the digital medium, a reward system tied to print publication, and siloed research communities that continue to impact the adoption of digital research methods in musicology. This article was published in <em>Digital Humanities Quarterly</em>, an open-access, peer-reviewed journal covering all aspects of digital media in the humanities. In this article, we outline the ways in which MuSO plans to improve the dissemination of digital outputs in music, and thereby strengthen community standards in music representation, promote data reuse, and create possibilities for new research that expands the musicological discipline. As the MuSO project team pursues additional digital aggregation projects and forms an editorial board to implement standards of digital peer review, I am studying musical genre and form tags as a music-specific method of information retrieval in big digital libraries and databases. As part of this work, I have undertaken qualitative research on the information-seeking preferences of music scholars to develop a holistic model of musical genre for MuSO. In support of this work, I received a fellowship from the Center for Cultural Analysis at Rutgers–New Brunswick to participate in their 2018-19 Classification Seminar.</p>

<p>I present at national and international conferences, and I publish my work in a variety of open access venues and repositories in order to expand the visibility of my research. As an example, on the basis of a conference talk at the joint International Association for Music Libraries/International Musicological Society Congress, which I subsequently <a href="https://francescagiannetti.com/musical-multimodal-meanderings/">published on my blog</a>, I was invited to submit an article (<a href="http://dx.doi.org/10.1080/10588167.2016.1166842">“A Review of Network Approaches in Music Studies”</a>) introducing humanistic applications of social network analysis, published in <em>Music Reference Services Quarterly</em>. I review books on digital humanities in libraries to extend the reach of the scholarship of my peers. I anticipate several upcoming conference talks and appearances to further develop my work on TEI pedagogy and musical genre in online information systems.</p>

<h2 id="service">Service</h2>

<p>My service is an extension of my librarianship and scholarship activity; I help to build community, capacity, and infrastructure in digital humanities, often at the interstices of music studies.</p>

<p>As part of my engagement with the global digital humanities community, I have sought service roles across disciplines that create opportunities for the cross-fertilization that I believe will enliven these fields. I am a founding member of Music Scholarship Online (MuSO), described above, for which I participated on an international team to crosswalk metadata for eighteenth-century music objects from Europeana, the EU digital platform for cultural heritage, to the music-specific data model developed for MuSO. For three years in a row, I have served as a member of the international program committee for the Conference on Digital Libraries for Musicology, a cross-disciplinary presentation venue for researchers working on, and with, large-scale digital libraries and databases in the domain of music and musicology. I have worked on advisory committees for grant-funded projects at Big Ten institutions supporting the creation of a computational text analysis curriculum for academic librarians (<a href="https://teach.htrc.illinois.edu/">“Digging Deeper, Reaching Further”</a>) and the development of a data capsule appliance for non-consumptive research on the copyrighted corpus in the HathiTrust Digital Library. I write peer reviews for the major national and international digital humanities conferences, as well as for several serial publications on digital libraries and music librarianship, and I have served on a review committee for the National Endowment for the Humanities Office of Digital Humanities. My editorial role with the <a href="https://nimbletents.github.io/">Nimble Tents Toolkit</a>, directly related to the mapathons described earlier, involves reviewing new contributions on the organization of rapid response teams and grassroots events to address urgent climate and political challenges involving free or widely used digital tools and platforms.</p>

<p>As part of my ongoing involvement in the Music Library Association (MLA), I have helped plan and organize several pre- and post-conference activities involving a range of hands-on trainings and presentations in digital methods at the annual meetings. One of my workshops, <a href="https://francescagiannetti.com/a-workshop-on-maps-and-timelines/">“A Workshop on Maps &amp; Timelines,”</a> which I subsequently published online, has been highlighted on the website of the Stanford Humanities + Design lab as a testimonial of their software application, Palladio. I have served on the MLA Emerging Technologies and Services Committee, and as the current convener of the Digital Humanities Interest Group, I have led an effort to create a comprehensive list of digital libraries, digital archives, open datasets, and digital humanities projects in music.</p>

<p>My involvement as a steering committee member of the Rutgers Digital Humanities Initiative (DHI) has been a focal point of my local service activity. As the only librarian on the steering committee, my work with the DHI is discipline-neutral, and involves building a research community through the gathering, synthesis, and dissemination of information on key events, lectures, conferences, and opportunities available at Rutgers and in the Greater New York area. In addition to organizing local workshops, many of which I also led, I have planned lectures, open house events, and an annual symposium showcasing the digital humanities research of students and faculty at Rutgers and at member institutions of the New Jersey Digital Humanities Consortium. I ensure that materials from all workshops are archived and freely available online. These materials continue to generate interest well after the date of the event; one of the most visited pages (700 unique page views) of the Rutgers DHI website is the workshop I developed on <a href="https://dh.rutgers.edu//thematic-maps-in-qgis/">“Thematic Maps in QGIS.”</a> In addition, I have served on an interdisciplinary review committee for the award of digital humanities seed grants, funded by the Rutgers School of Arts and Sciences, to support the early stages of digital project work by Rutgers graduate students and faculty. This committee disbursed grants to ten recipients, whose work is presented at the annual symposium (“DH Showcase”) and on the <a href="https://dh.rutgers.edu/projects">DHI’s website</a>.</p>

<p>I take pleasure in mentoring and advising students working with digital methods. I have worked with Rutgers Future Scholars and Aresty Research Assistants on projects relating to text encoding and digital editions of correspondence collections. I also served on the Research Database Task Force, which examined ways of expanding and promoting undergraduate research opportunities at Rutgers–New Brunswick.</p>

<h2 id="conclusion">Conclusion</h2>

<p>As a digital humanities librarian with subject liaison responsibilities, I have sought to emphasize the commonalities of digital research methods with longstanding humanistic scholarship practices to transform teaching, learning, and research in a range of disciplines. Through my interventions in pedagogy and outreach, I have helped to establish the Libraries as one of very few places at Rutgers where students and faculty can develop technical skills that are rarely taught in academic units. I have promoted the use of unique library collections and computational methods of analysis among scholars and students, and I have demonstrated their significance in peer-reviewed publications and presentations. I look forward to contributing to the ongoing transformation of library and humanistic practices at Rutgers University, and among library professionals and researchers locally and globally, in the years to come.</p>

<!-- 

> We can learn to work and speak when we are afraid in the same way we have learned to work and speak when we are tired. For we have been socialized to respect fear more than our own needs for language and definition, and while we wait in silence for that final luxury of fearlessness, the weight of that silence will choke us.[^fn1]

Lastly, I do not take lightly the privilege of comparative job security in a period when so many accomplished and smart people are struggling to find meaningful work. Since I am a bit shy by nature, I know that I have benefited from bucketfuls of privilege and luck to enjoy this modicum of professional success. And I am cognizant of the fact that I have a duty to use my voice to protect and promote the interests of my fellow library workers and those of the more precariously employed early career digital humanists who are facing one of the worst hiring seasons on record. So I've got my eye on projects like the [Visionary Futures Collective](https://visionary-futures-collective.github.io/), [Bearing Witness](https://bearingwitness.github.io/), and the [Academic Job Market Support Network](https://hcommons.org/groups/academic-job-market-support-network/). I am working with Grant Wythoff at the Princeton Center for Digital Humanities to get [Humanist Mutual Aid](https://humanistmutualaid.com/) up and running. And in my library and digital humanities activity, I cite the work of scholars of color, of women of color in particular, in part because I've been doing some catching up with this literature and it's on my mind, and in part because I find that the work is trenchant and exciting and compassionate, and we all deserve to see more of it. Although it's always time to invest in understanding the perspectives of different people, that activity is vitally important now. There have been a handful of scholars who I reliably read, no matter what the genre. They include Brittney Cooper, Tressie McMillan Cottom, Jessica Marie Johnson, and Marisa Parham. Lately, I've added Audre Lorde, bell hooks, and Toni Morrison to the mix, in addition to... alright fine, I will out myself as a romance reader... Courtney Milan, Rebekah Weatherspoon, and KJ Charles (not a BIPOC author, but she writes mostly LGBT romance). 
One of the marvelous things about having this eclectic mix of voices floating around in my head is... 

So with this meandering and unorthodox introduction, which through the alchemy of PageRank and SEO can never be separated from what follows (lol), I herewith present my personal statement.

I am deeply grateful for, and indebted to my professional network. Pre-tenure, I remember having had a conversation with my Rutgers friends about whether or not I had submitted any names from whom I preferred external letters *not* be requested. I had not. I don't have professional nemeses (as far as I know) and I have difficulty imagining any senior member of the digital humanities community *not supporting* the promotion of an up-and-coming member of the field. I like my specialization, and the people have a tremendous amount to do with that fact.

I am also cognizant of the fact that I have won a kind of lottery in which the winners are fewer and fewer each year. I did do the work that was required of me for promotion at my institution, but so have countless others who have not yet been successful at securing long-term employment in academic libraries. 

comments on odds of being shy and being comparatively successful professionally
impostor syndrome, race, and getting over it to speak and write and do things that need doing 

Being a shy person who is able to enjoy a modicum of professional success is yet another legacy of whiteness. 

When writing the last article I published before submitting my tenure packet, I struggled. Every day felt like a vicious street fight. Every day for weeks on end, the only product of my struggle was a single sentence or two. I wondered anew about how I had managed to get anything written or published before. And as I got up in the morning and sipped my first cup of coffee while staring blankly at the last mangled sentences on my laptop screen, I would hem and haw about how to continue. Not infrequently, a voice in my head would say: "I don't know what I think about this yet." And then: "HOW CAN I NOT KNOW WHAT I THINK ABOUT THIS YET?" I frequently got in my own way, berating myself for the fact that my thoughts were still taking shape (spoiler: thinking is not only allowed but encouraged when doing research). There were endless walks around my neighborhood trying to sort out my next move. This went on for a long time. Way too long of a time, it seemed to me, but the truth of the matter is that even a sentence a day starts to add up. Eventually, after many months, there was a full manuscript that I gave to a few colleagues to read and comment on. I followed most but not all of their suggestions, submitted to a journal, and received that rarest of academic prizes: an acceptance without revisions. 

I've accepted the fact that the research and writing will not get easier. The mere proposition of sharing my work with a public seems to conjure every last insecurity in me. I also know that the fear I face is largely irrational. There haven't been so many social penalties for speaking my mind---or at least not many that I wasn't willing to face---and never a looming threat to my physical safety. I've been lucky in this too. 


[^fn1]: Lorde, Audre. 1984. The Transformation of Silence into Language and Action. In _Sister Outsider: Essays & Speeches_ 44. Freedom, CA: The Crossing Press. [Orig. pub. 1978.]

-->

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:fn1" role="doc-endnote">
      <p>See Paige Morgan and Helene Williams’s analysis of digital humanities/digital scholarship librarian job descriptions and accompanying visualization at <a href="http://www.paigemorgan.net/the-expansion-development-of-dhds-librarians/">http://www.paigemorgan.net/the-expansion-development-of-dhds-librarians/</a>. <a href="#fnref:fn1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Francesca Giannetti</name></author><category term="promotion" /><category term="personal statement" /><summary type="html"><![CDATA[I let my promotion with tenure (Librarian II at my institution) pass without comment last year because it was a dreadful year in which some of my colleagues lost jobs and I felt more than ever like I had won some kind of a lottery just by happening to be in the right place at the right time. With that said, though, I know a lot of work goes into evaluating tenure cases, and I am grateful to my Rutgers colleagues and to my external referees for their careful attention and support. I also felt the need to share some of what went into my dossier, because I know how hard it is to articulate one’s value in a newer academic library role. I am the first digital humanities librarian at my institution and I am also a subject librarian. I was acutely aware that DH looks a bit different at every institution and I was perhaps overly sensitive about the gaps in my own portfolio vis-à-vis whatever shared understanding of digital humanities could be said to exist. It was a profound help to me to be able to read other people’s professional statements as I was preparing my own, including those of Heather Coates and Ryan Cordell. Huge props also to Kalani Craig for publishing her statement after she submitted (and before hearing the result!). So with this post, I would like to share my own personal statement, in case others can benefit from seeing how I made sense of the various threads of my professional responsibilities.]]></summary></entry></feed>