the visible archive: Series Title Cloud

Series Title Cloud

Posted by Mitchell at 8:17 pm

So, it's no Wordle, but it's my first text cloud. This visualises the 250 most common words in the titles of each series in the initial dataset. It's also the first time I've mined the titles for data, and another step in the process of feeling out the attributes of this dataset. I've excluded a few "stop" words ("and","with","the","for") and anything with less than three characters, but otherwise this is a raw representation of the titles.

It shows first of all that the most frequently occuring terms in series titles are either generic descriptors ("files", "correspondence") or metadata, referring to the organisation or structure of the series, rather than its content ("alphabetical", "prefix", "single"). But then after the top twenty or so words, there's a large number of more descriptive terms. The difference in scale between these layers is significant; for example "series" and "files" occur in around 10,000 series titles (about a third of all series), whereas "drawings" occurs in around 800 series, and "HMAS", "Papua" and "Lighthouse" all occur in around 200 series. Some odd features show up as well, for example "Yokohama" and "specie"; turns out there are a large number of series consisting of records from the Yokohama Specie Bank, a Japanese bank involved in trade with China and Australia around the mid-C20th - it gets a mention in this 1940 telegram from Menzies to the High Commissioner in London. I wonder how the records ended up in the Archives?

Next, to try integrating text clouds as interfaces / overlays for the previous visualisations.

1 comments:

Unknown said...

The Yokohama Specie Bank records are in the National Archives because of enemy property regulations created during World War II. The position of Controller of Enemy Property was established to administer the regulations (which came into effect in September 1939). When Japan entered the war in 1941, Japanese assets in Australia were seized. There are about 60 Yokohama Specie bank series listed in RecordSearch under the Controller of Enemy Property. Other Japanese company records include those of Mitsui Pty Ltd (Melbourne) and the Japan Cotton Trading Company. There's more on all of this the Archives' guide to Japanese records by Pam Oliver, Allies, Enemies and Trading Partners.

27 October 2008 at 2:16 pm

the visible archive

creative research on the visualisation of archival datasets

Series Title Cloud

1 comments:

About

Visualisations

related

Labels

Blog Archive

Subscribe