CatDan Dev

Obtaining Full Titles

Posted in dev by gwyant on March 10, 2010

Sadly, there is no API for getting a definitive list of SEP entries with their URLs and full titles, so I had to screen-scrape all 1170 of them. I used the SEP’s table of contents as an index, then pulled each full title from that entry’s archival/citation page, which is much smaller to transfer and parse than the entry itself. Now I have a DB table with titles like “18th Century British Aesthetics” rather than “British, in the 18th century.” Each entry also has a unique ID that I can use as a key in future tables, which should make calculations faster than matching on the entry title.
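
For anyone curious what that looks like, here is a minimal sketch of the two-step scrape in Python, not the actual script. Everything specific in it is an assumption made for illustration: the sample markup, the `entries/<slug>/` link pattern, the idea that the citation page carries the full title in its first `<h1>`, the `entries` table layout, and the `fetch_citation` callable (which stands in for whatever HTTP request you would really make). It only uses the standard library.

```python
import sqlite3
from html.parser import HTMLParser


class EntryLinkParser(HTMLParser):
    """Collect (slug, short_title) pairs from links shaped like entries/<slug>/."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._slug = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            parts = href.strip("/").split("/")
            # Assumed link pattern: a relative href of the form entries/<slug>/
            if len(parts) == 2 and parts[0] == "entries":
                self._slug = parts[1]
                self._text = []

    def handle_data(self, data):
        if self._slug is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._slug is not None:
            self.entries.append((self._slug, "".join(self._text).strip()))
            self._slug = None


class FirstHeadingParser(HTMLParser):
    """Grab the text of the first <h1> (assumed to hold the full title)."""

    def __init__(self):
        super().__init__()
        self.title = None
        self._in_h1 = False
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and self.title is None:
            self._in_h1 = True

    def handle_data(self, data):
        if self._in_h1:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            # Collapse runs of whitespace inside the heading.
            self.title = " ".join("".join(self._text).split())
            self._in_h1 = False


def build_table(conn, toc_html, fetch_citation):
    """Index entries from the table of contents and store their full titles."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries ("
        "id INTEGER PRIMARY KEY, slug TEXT UNIQUE, title TEXT)"
    )
    toc = EntryLinkParser()
    toc.feed(toc_html)
    for slug, short_title in toc.entries:
        page = FirstHeadingParser()
        page.feed(fetch_citation(slug))
        # Fall back to the table-of-contents title if no heading was found.
        title = page.title or short_title
        conn.execute(
            "INSERT OR IGNORE INTO entries (slug, title) VALUES (?, ?)",
            (slug, title),
        )
    conn.commit()


# Made-up sample data; the real pages will look different.
SAMPLE_TOC = """
<ul>
  <li><a href="contents.html">Table of Contents</a></li>
  <li><a href="entries/aesthetics-18th-british/">British, in the 18th century</a></li>
  <li><a href="entries/example-entry/">Example</a></li>
</ul>
"""
SAMPLE_PAGES = {
    "aesthetics-18th-british": "<h1>18th Century British Aesthetics</h1>",
    "example-entry": "<p>No heading on this page.</p>",
}

conn = sqlite3.connect(":memory:")
build_table(conn, SAMPLE_TOC, lambda slug: SAMPLE_PAGES[slug])
rows = conn.execute("SELECT id, slug, title FROM entries ORDER BY id").fetchall()
```

The integer `id` column is what later tables would reference instead of the title string. Passing the fetcher in as a callable keeps the parsing testable offline, and makes it easy to add caching or a delay between requests when running against the real site.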
