Crowdsourcing Reflection


Reflection on crowdsourcing projects

Online crowdsourcing projects increasingly rely on public involvement. Holley writes that crowdsourcing enables a large group of users to work together on a collaborative project which has a particular goal and which often brings significant results.[1] Galaxy Zoo is a good example of a simple crowdsourcing project, where users with an interest in astronomy are asked to identify objects from a dataset created by the Sloan Digital Sky Survey.[2] Users classify an object by choosing, from three images, the one they think most closely resembles it. This enables the user to participate in the project interactively and assists the creators of the project by providing information. There is now an increasing number of cultural and heritage-based crowdsourcing projects. Old Weather[3] and Transcribe Bentham[4] are two such projects with a dedicated public following, and subscribers to these and similar projects are now being termed citizen historians.

Both of these projects are interesting and offer a real sense of contributing to a significant result. The Old Weather project requires the user to transcribe ships' logs and input information such as dates, weather, wind forces and ships' positions. The website is well laid out and has an easy-to-follow tutorial, with a link to more experienced users who are happy to offer advice. The project also offers a sense of competition, with users able to progress from cadet to captain depending on how many logs they transcribe. However, it is quite a narrow field of interest and the logs can be somewhat repetitive. I think that this type of project is only of interest to someone with a specific interest in maritime history or climate change. The crowdsourcing project Transcribe Bentham was established in order to speed up the transcription of Bentham’s papers and to engage the public with his ideas. Transcribe Bentham differs from Old Weather in that it requires users to transcribe complex manuscripts whose handwriting is often hard to decipher. Transcribe Bentham appeals to both academics and citizen historians. Its site is quite difficult for first-time users to navigate, but the online tutorial is informative and helpful. The site also offers practice manuscripts which enable users to build up confidence and skills. I think that Transcribe Bentham takes the work more seriously, as all transcriptions have to be sent for approval before they can be published. This is not the case with Old Weather, but this may be because Transcribe Bentham is more of an academic site. I enjoyed transcribing on both these sites, but I preferred Transcribe Bentham as the manuscripts were more interesting from a historical point of view.

[1] R. Holley, ‘Crowdsourcing: How and Why Should Libraries Do It?’ D-Lib Magazine, vol. 16, no. (2010).

[2] ‘Galaxy Zoo’; consulted 18 March 2015.

[3] ‘Old Weather – Our Weather’s Past, The Climate’s Future’; consulted 12 March 2015.

[4] ‘Transcribe Bentham: Transcription Desk’; consulted 11 March 2015.

Reflection on the use of the Oxford Knights Archive

Reflection on The Oxford Knights Archive as a mapping exercise

‘The Oxford Knights Archive (2014-15)’ is a dataset compiled by Corey Albone, Jack Dunne, Namiluko Indie and Bethany Reid last year during their digital history classes. The dataset contains information on a group of students who graduated from Oxford University before 1715. The information comes from a published book called Alumni Oxonienses[1] that has been digitized by British History Online. The dataset contains the dates of birth of all graduates from Oxford University who then went on to become knights. During our class we were required to create a Google Fusion Table[2], import the data and then make some changes. The information was very interesting. However, one disadvantage was how much the percentage of results reported as ambiguous differed between individual users. All the students in my workshop worked on this at the same time and the percentage varied between one and six percent. This is a disadvantage because it does not give a standard result for all users. I repeated the exercise at home, putting in the same data and making the same changes, and received a three percent ambiguous result, whereas in class I had a six percent result. This suggests that Google decides afresh what counts as an ambiguous result each time the data is processed.

One of the objectives was to create a heat map showing the areas in which graduates were born. By looking at our results we were able to discern how far from Oxford University they lived. From a historical perspective, the majority of the graduates lived quite close to Oxford and were mainly located in England. Interestingly, there were no graduates from the East Anglia area, which is geographically nearer to Cambridge University, so we could presume that people in that area went to Cambridge instead of Oxford because of the distance. However, as we do not have a dataset with the required details of Cambridge graduates, this remains an assumption and we cannot class it as a fact. There were also very few graduates from Wales showing on the heat map. This is an expected result, as Wales at the time was sparsely populated compared with England. I also created a heat map for Exeter University, and the majority of students for that university were situated in the south-west of England. These findings could indicate that during this period most students went to the nearest university and did not travel far from home. This could be because the roads were not felt to be safe to travel on, due to civil unrest and fear of crime.

A mistake also showed up in the dataset regarding one graduate called Richard Breame. His entry showed that he was born in Vancouver, Canada, which we know is incorrect because he died long before Vancouver was established. The mapping tool has misinterpreted the word Surrey as a place of the same name located near Vancouver. The problem is that the computer sees the word Surrey as ambiguous, so the entry has to be changed to state that it is Surrey, England. This illustrates that the results may not always be a hundred percent accurate and that we need to use them with caution. It is also sometimes important to be more specific with place names to ensure you obtain the most accurate result possible.
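The Surrey correction described above can be applied to a whole column of birthplaces before the data is uploaded. The following is only a minimal sketch: the function name `qualify_place` and the short list of county names are assumptions made for illustration and are not part of the Oxford Knights dataset or of any Google tool.

```python
# Hypothetical list of English place names that also exist elsewhere in the world.
# Only "Surrey" comes from the exercise itself; the others are illustrative.
AMBIGUOUS_ENGLISH_PLACES = {"Surrey", "Kent", "Durham", "York"}


def qualify_place(place, country="England"):
    """Append a country to a place name that a geocoder could misread."""
    cleaned = place.strip()
    if cleaned in AMBIGUOUS_ENGLISH_PLACES:
        return f"{cleaned}, {country}"
    # Names that are already qualified, or not in the list, are left unchanged.
    return cleaned


birthplaces = ["Surrey", "Oxford", "Surrey, England"]
print([qualify_place(p) for p in birthplaces])
```

Checking the qualified names by eye before mapping them is still advisable, since a fixed list like this cannot anticipate every ambiguous name.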

I think that uploading a dataset to a Google Fusion Table can give you some interesting results and also enables speedier research. For instance, it would have taken far longer to trace the birthplace of every person who went to Oxford University manually. The heat map function lets you see the results instantaneously and gives a clear picture of location and proximity to universities. However, as previously mentioned, the results can vary between users and more specific data is sometimes needed to improve accuracy. Generally, I think that Google Fusion Tables are a useful tool for historians interpreting historical data.

[1] ‘THE BRITISH LIBRARY – The World’s Knowledge’; consulted 29 March 2015.

[2] ‘About Fusion Tables – Fusion Tables Help’; consulted 3 April 2015.

Strengths and weaknesses of the approaches used by the British Newspapers 1600-1950 and British History Online

Reflect on the different strengths and weaknesses of the approaches used by the British Newspapers 1600-1950 (part 2) and British History Online (part 3).

The British newspapers of this period contain reports, advertisements and articles on a wide range of subjects. The newspapers cover national and local news as well as international affairs and politics. All of this information is invaluable for historians seeking to understand daily life, and it provides an insight into the past. There is also a useful website[1] which holds around three million digitized pages of newspaper content from 408 newspapers. However, when looking at the newspaper coverage of earlier periods, there are some weaknesses to note. One weakness is that many newspapers copied each other’s stories, so there was a large amount of repetition and, arguably, a lack of originality in reporting the news. Local newspapers also often copied news published in the large national papers, so there was little variation in local publications. Furthermore, some of the articles are very short and there is often no evidence of where an article originated, which makes them difficult to authenticate. Spelling also differs in many ways from modern-day English. For example, the long ‘s’ was often used and is often mistaken for an ‘f’, which can cause difficulties in understanding the text. Another example is the misspelling of words: the word Irish can also be found under spellings such as I rifh and I rish. This can make research difficult, as it can lead to sources being missed. It can also be more time consuming to go through all the documents when a keyword has more than one spelling. However, searching on the variants does bring up more sources, so it may come down to personal preference as to whether quality is better than quantity.
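One way to avoid missing sources because of the long ‘s’ is to search for every likely spelling of a keyword. The sketch below is an illustration only, and is not a feature of the newspaper website: the function name `long_s_variants` is invented here, and it assumes that any lowercase ‘s’ except the last letter of a word may have been read as an ‘f’.

```python
from itertools import product


def long_s_variants(word):
    """Return spellings of `word` with each non-final 's' optionally read as 'f'."""
    last = len(word) - 1
    choices = []
    for i, ch in enumerate(word):
        if ch == "s" and i != last:
            # A long 's' inside a word may have been transcribed as 'f'.
            choices.append(("s", "f"))
        else:
            choices.append((ch,))
    # Build every combination of the possible letters, in alphabetical order.
    return sorted("".join(combo) for combo in product(*choices))


print(long_s_variants("Irish"))  # ['Irifh', 'Irish']
```

Each variant can then be entered as a separate search term. The sketch does not cover spellings broken by a stray space, such as I rish, which would need to be added by hand.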

British History Online is a non-profit organisation that was founded by the Institute of Historical Research in 2003. Its main purpose is to bring together material from museums, libraries, archives and collections and make it accessible to both the public and the academic community. The website has secondary and primary sources from the medieval period up to the twentieth century, and also has maps, guides and images to aid research. The site focuses on British history, including colonial history and that of the Norman invasion. The site is well regulated and uses a double re-typing method which keeps mistakes to a minimum. So, unlike the Connected Histories website, there is a much lower chance of bringing up sources which contain misspellings. However, this could be seen as a weakness, as there is a chance of missing something interesting. Another point worth noting is that the site only focuses on British history, so there is very limited information about international affairs unless they are directly connected to British history. Nonetheless, the site does have an extensive list of external resources which can provide further information. Sources found on British History Online may be referenced, with the proviso that no more than a few lines are reproduced unless permission is given, so there may be a wait while permission is obtained.

[1] ‘Search The Archive: British Newspaper Archive’; consulted 2 April 2015.

Texas Slavery Project

The Texas Slavery Project

The Texas Slavery Project[1] examines the growth of the slave population during the nineteenth century in the state of Texas. The project is centred on databases containing information regarding slave and slaveholder populations during the period 1837–45. The site is easy to navigate and well presented, with easy-to-read fonts, and it offers an interactive tool where users can search maps to research slave and slaveholder population statistics. The website is also easy to evaluate using the criteria recommended by Kupersmith.[2] The domain of the website is a .org, which suggests that it belongs to a non-profit organisation and is not looking to make money from its findings. The page about the project links to the members of the project and lists their qualifications and authority. These signs indicate that it is a valid and trustworthy website. Furthermore, the sources that the site uses are digitised primary sources which can easily be authenticated. The sources are well organised and can easily be found under subheadings such as The Laws of Texas, The James F. Perry Papers and the Diplomatic Correspondence of the Republic of Texas. The project also has primary sources from newspapers of the period, The Telegraph & Texas Register and The Civilian & Galveston Gazette. The Telegraph & Texas Register has editorials on slavery and the cotton market, and news articles regarding the annexation of Texas to the United States. The Civilian & Galveston Gazette includes news regarding sugar and cotton production and also has advertisements for slaves. It is also a good source of information regarding the passing of laws in Galveston restricting the movement of African Americans. There is therefore a broad and varied range of sources to aid research.

The project commenced in 2007 and was founded by Dr Andrew Torget, who is also the director. There are four additional members of support staff. However, only one of these is an academic historian, and at the time of the project he was still completing his PhD. So whilst the project is well supported by technical staff, only two of its members are academic historians. The project is sponsored by two organisations, the Summerlee Foundation and the Virginia Center for Digital History. Whilst the Summerlee Foundation provides financial support, it is not a historical institution, and the Virginia Center only provides web and IT support. So it could be implied that the responsibility for historical accuracy lies with only one person, Dr Torget. There are also no scanned images of the primary sources, but the site is almost ten years old, and when it was built the storage space required would have been considerably more expensive. Better image compression technologies are available today, and this, coupled with superfast broadband, means that it is easier to deliver high quality images to a wider range of users. Furthermore, the site does provide information about the sources, so it would not be difficult to find extra information independently. It should also be noted that the Texas Slavery Project was honoured as a project in digital scholarship at the annual Nebraska Digital Workshop, which was held in October 2007 at the University of Nebraska. This illustrates that the project has been professionally recognised by its contemporaries in digital history.

[1] ‘Texas Slavery Project’; consulted 18 April 2015.

[2] J. Kupersmith, ‘Evaluating Web Pages: Techniques To Apply & Questions To Ask’, Lib.Berkeley.Edu; consulted 30 March 2015.