Prague astronomical clock
Prague astronomical clock. Photo by Steve Collis from Melbourne, Australia [CC BY 2.0], via Wikimedia Commons

In one of my previous blog posts, “How to write dates?”, I discussed the basic universal date and time notation specified in the International Organization for Standardization standard ISO 8601 and its World Wide Web Consortium (W3C) simplification. Since that time the Library of Congress has completed work on an extension of this standard, the Extended Date/Time Format (EDTF) 1.0. This extension for the most part deals with expressing uncertain dates and times. Such limited or imprecise date/time information is a common occurrence when recording historical events in archives, libraries, etc. ISO 8601 does not allow for the expression of such concepts as “approximately the year 1962”, “some year between 1920 and 1935”, or “the event occurred probably in May 1938, but we are not certain”. The EDTF standard allows us to express them in a formalized way, fulfilling a real need in many fields dealing with historical metadata.

Although the standard is relatively new and there are few software tools to help enter or validate uncertain date and time data, I believe it is worth familiarizing oneself with the new notation wherever possible.


I would like to begin with some definitions to facilitate the discussion of the new notation. The definitions are accompanied by the symbols that will be used in the next section.


Precision

Precision is a measure of the range or interval within which the ‘true’ value exists [1]. Precision is explicit in the date or date/time expression: if an event occurred in the year 1318, the precision is one year (it could have occurred at any time within that year). If we specify 1945-09-15, the precision is one day, etc. [2] In EDTF we can extend this definition to specify a decade or century precision using the x symbol - see the discussion of masked precision below.

Approximate (~)

An estimate that is assumed to be possibly correct, or close to correct, where “closeness” may depend on the specific application.

Uncertain (?)

We are not sure of the value of the variable (in our case, a date or time). Uncertainty is independent of precision. The source of the information may itself be unreliable, or we may face several possible values with not enough information to choose between them. For example, we may be uncertain as to the year, month, or day of an event.

Unspecified (u)

The value is not stated. The point in time may be unspecified because it has not yet occurred, because it is classified or unknown, or for any other reason.
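To make the notation concrete, here is a minimal Python sketch that recognizes a few of the features defined above (uncertain ?, approximate ~, unspecified u). It is illustrative only, not a full EDTF validator, and the example strings and their glosses are my own.

```python
import re

# Minimal, illustrative pattern for a few EDTF 1.0 features discussed
# above; it deliberately ignores most of the specification.
EDTF_SKETCH = re.compile(r"""
    ^
    [0-9u]{4}        # year: digits, with 'u' for an unspecified digit
    (-[0-9u]{2})?    # optional month
    (-[0-9u]{2})?    # optional day
    [?~]?            # optional qualifier: uncertain (?) or approximate (~)
    $
""", re.VERBOSE)

examples = {
    "1962~":      "approximately the year 1962",
    "1938-05?":   "probably May 1938, but we are not certain",
    "196u":       "some unspecified year in the 1960s",
    "1945-09-15": "an ordinary precise date (one-day precision)",
}

for expr, meaning in examples.items():
    status = "valid" if EDTF_SKETCH.match(expr) else "invalid"
    print(f"{expr:12} {status:8} {meaning}")
```

A real application would use a dedicated EDTF parsing library rather than a hand-rolled regular expression, but the sketch shows how compactly the qualifiers encode uncertainty.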

In connection with its move to a new location, the Institute is holding a sale of used furniture, books, and household items. For sale: wardrobes, dressers, beds, mattresses, desks, cutlery, shelves, tables, a washer-dryer, bedding, and other items. Metal filing cabinets are free to take.

The sale will take place at the Institute’s former location at 180 Second Avenue:

  • Friday, April 24, from 9:00 a.m. to 8:00 p.m.
  • Saturday, April 25, from 10:00 a.m. to 5:00 p.m.

Larger items can be viewed on our website. Buyers must arrange their own transport.

Bargain prices!


Tel: 212-505-9052 or 845-520-1092

Part II: Product

(Guest blog by Rob Hudson)

Arthur Rubinstein (Linked Data)

In Part I of this blog, I began telling you about my experience transforming Carnegie Hall’s historical performance history data into Linked Open Data, and in addition to giving some background on my project and the data I’m working with, I talked about process: modeling the data; how I went about choosing (and ultimately deciding to mint my own) URIs; finding vocabularies, or predicates, to describe the relationships in the data; and I gave some examples of the links I created to external datasets.

In this installment, I’d like to talk about product: the solutions I examined for serving up my newly-created RDF data, and some useful new tools that help bring the exploration of the web of linked data down out of the realm of developers and into the hands of ordinary users. I think it’s noteworthy that none of the tools I’m going to tell you about existed when I embarked upon my project a little more than two years ago!

As I’ve mentioned, my project is still a prototype, intended to be a proof-of-concept that I could use to convince Carnegie Hall that it would be worth the time to develop and publish its performance history data as Linked Open Data (LOD) — at this point, it exists only on my laptop. I needed to find some way to manage and serve up my RDF files, enough to provide some demonstrations of the possibilities that having our data expressed this way could afford the institution. I began to realize that without access to my own server this would be difficult. Luckily for me, 2014 saw the first full release of a linked data platform called Apache Marmotta by the Apache Software Foundation. Marmotta is a fully-functioning read-write linked data server, which would allow me to import all of my RDF triples, with a SPARQL module for querying the data. Best of all, for me, was the fact that Marmotta could function as a local, stand-alone installation on my laptop — no web server needed; I could act as my own, non-public web server. Marmotta is out-of-the-box, ready-to-go, and easy to install — I had it up and running in a few hours.
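To give a feel for what querying the imported triples involves, here is a stdlib-only Python sketch of the idea behind a SPARQL basic graph pattern: a filter over (subject, predicate, object) triples, which is what Marmotta’s SPARQL module evaluates against the store. The URIs and property names below are invented placeholders, not Carnegie Hall’s actual data model.

```python
# Triples as (subject, predicate, object) tuples; all identifiers here
# are invented for illustration.
triples = [
    ("ex:event1", "rdf:type", "schema:MusicEvent"),
    ("ex:event1", "schema:startDate", "1891-05-05"),
    ("ex:event1", "schema:performer", "ex:tchaikovsky"),
    ("ex:tchaikovsky", "schema:name", "Pyotr Ilyich Tchaikovsky"),
]

def match(triples, s=None, p=None, o=None):
    """Return triples matching a basic graph pattern; None acts as a variable."""
    return [(ts, tp, to) for ts, tp, to in triples
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]

# Roughly what SELECT ?who WHERE { ex:event1 schema:performer ?who } asks:
performers = [o for _, _, o in match(triples, s="ex:event1", p="schema:performer")]
print(performers)
```

A real triple store indexes these patterns and supports joins across them, but the mental model of pattern-matching over triples carries over directly.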


Rob Hudson - Photo by Gino Francesconi

Part I: Process

(Guest blog by Rob Hudson)

My name is Rob Hudson, and I’m the Associate Archivist at Carnegie Hall, where I’ve had the privilege to work since 1997. I’d like to tell you about my experience transforming Carnegie Hall’s historical performance history data into Linked Open Data, and how within the space of about two years I went from someone with a budding interest in linked data, but no clue how to actually create it, to having an actual working prototype.

First, one thing you should know about me: I’m not a developer or computer scientist. (For any developers and/or computer scientists out there reading this right now: skip to the next paragraph, and try to humor me.) I’m a musician who stumbled into the world of archives by chance, armed with subject knowledge and a love of history. I later went back and got my degree in library science, which was an incredibly valuable experience, and which introduced me to the concept of Linked Open Data (LOD), but up until relatively recently, the only line of programming code I’d ever written was a “Hello, World!”-type script in BASIC — in 1983. I mention this in order to give some hope to others out there like me, who discovered LOD, thought “Wow, this is fantastic — how can I do this?”, and were told “learn Python.” Well, I did, and if I can do it, so can you — it’s not that hard. Much harder than learning Python — and, one might argue, more important — is the much more abstract process of understanding your data, and figuring out how to describe it. Once you’ve dealt with that, the transformation via Python is just process — perhaps not a cakewalk, but nonetheless a methodical, straightforward process that you can learn and tackle, step by step.

Now let me tell you a bit about the data that I worked with for my linked data prototype. The Carnegie Hall Archives maintains a database that attempts to track every event, both musical and nonmusical, that has occurred in the public performance spaces of Carnegie Hall since 1891. (Since the CH Archives was not established until 1986, there are some gaps in these records, which we continue to fill in using sources like digitized newspaper listings and reviews, or missing concert programs we buy on eBay.) This database now covers more than 50,000 events of nearly every conceivable musical genre: classical, folk, jazz, pop, rock, world music, and no doubt some I’m overlooking.  But Carnegie Hall has always been about much more than music; its stages have also featured dance and spoken word performances, as well as meetings, lectures, civic rallies, political conventions — there was even a children’s circus, complete with baby elephants, in 1934. Our database has corresponding records for more than 90,000 artists, 16,000 composers and over 85,000 musical works. Starting in 2013, we began publishing some of these records to our website, where you can now find the records for nearly 18,000 events between 1891 and 1955.  The limited release reflects our ongoing process of data cleanup, and we’re continuing to publish new records each month.  For my linked data prototype, I chose to use this published data set, since I knew it was good, clean data.
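As a rough illustration of the kind of row-to-triple transformation involved, the sketch below turns a single tabular record into triples. The column names, URI pattern, and properties are invented for this example; the real Carnegie Hall data model is considerably more involved.

```python
import csv
import io

# A stand-in for one exported database row; columns are invented.
rows = io.StringIO(
    "event_id,date,work,composer\n"
    "10001,1891-05-05,Marche Solennelle,Tchaikovsky\n"
)

triples = []
for row in csv.DictReader(rows):
    # Mint a URI for the event (hypothetical pattern), then attach
    # its properties as (subject, predicate, object) triples.
    event = f"http://example.org/event/{row['event_id']}"
    triples.append((event, "schema:startDate", row["date"]))
    triples.append((event, "schema:workPerformed", row["work"]))
    triples.append((event, "schema:composer", row["composer"]))

for t in triples:
    print(t)
```

The hard part, as noted above, is deciding what the subjects, predicates, and objects should be; once that modeling work is done, the loop itself is mechanical.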