The Java API Migration Corpus is a collection of change data for Java libraries, available to researchers and software engineering professionals who are interested in addressing the problem of library migration. The dataset describes all of the binary-incompatible changes between API versions, as well as what the replacement functionality should be for those changes.
This data is copyrighted and licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Researchers are welcome and encouraged to download the data and to use it in evaluating techniques and prototype tools that assist developers with problems caused by library migration.
Current Version: 1.1, Released: November 2014
The 1.1 version of the Library Migration Corpus comprises data for the following APIs:
- Apache Struts
  - v1.0.2, 1.1.0, 1.2.4, 1.2.6, 1.2.7, 1.2.8, 1.2.9
- jDOM
  - v1.0.b6, 1.0.b7, 1.0.b8, 1.0.b9
- log4J
  - v1.0.4, 1.1.3, 1.2.1, 1.2.2, 1.2.4, 1.2.5, 1.2.6, 1.2.7, 1.2.8, 1.2.9
- Lucene
  - v3.1.0, 3.2.0, 3.3.0, 3.4.0
Java API Library Migration Corpus by Bradley E. Cossette and Robert J. Walker is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. By downloading this data, you agree to the terms and conditions set forth in the Creative Commons licensing agreement.
Download LibraryMigrationCorpus1.1.zip, updated to November 2014.
This is the version used in Cossette's PhD thesis.
Version 1.0, Released: November 2012
The 1.0 version of the Library Migration Corpus comprises data for the following APIs:
- Apache Struts
  - v1.0.2, 1.1.0, 1.2.4, 1.2.6, 1.2.7, 1.2.8, 1.2.9
- jDOM
  - v1.0.b6, 1.0.b7, 1.0.b8, 1.0.b9
- log4J
  - v1.0.4, 1.1.3, 1.2.1, 1.2.2, 1.2.5, 1.2.6, 1.2.7, 1.2.8, 1.2.9
Java API Library Migration Corpus by Bradley E. Cossette and Robert J. Walker is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. By downloading this data, you agree to the terms and conditions set forth in the Creative Commons licensing agreement.
Download LibraryMigrationCorpus1.0.rar. [Note that this version was reconstructed from raw files on the basis of their timestamps. While we suspect that this is the true version, we cannot guarantee it.]
This is the version used in our FSE 2012 paper.
Providing Feedback and/or Additional Data
The usefulness of this dataset improves as other researchers and software engineering professionals provide feedback on the correctness of the data, or contribute additional datasets from their own investigations of other Java API migrations. We welcome any and all contributions to this data, and are happy to include your name in the list of contributors who have built this dataset.
As new data is added and/or corrections made to the data, we will update the version number of the latest release as follows: [major].[minor].[revision]
- [minor] releases represent the addition of an API, or the addition of another set of API migrations for one or more existing APIs.
- [revision] releases represent the correction of existing data: a better replacement was found for a change described in the data set.
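For anyone scripting against several releases of the corpus, the numbering scheme above can be sketched as a small parser and comparison. This is only an illustration, not part of the corpus: the names `CorpusVersion`, `parse_version` and `kind_of_change` are hypothetical, and treating a two-part version such as 1.1 as revision 0 is our assumption.

```python
from typing import NamedTuple


class CorpusVersion(NamedTuple):
    """A corpus release number in [major].[minor].[revision] form (hypothetical helper)."""
    major: int
    minor: int
    revision: int = 0


def parse_version(text: str) -> CorpusVersion:
    """Parse a string such as '1.1' or '1.1.2' into a comparable version."""
    parts = [int(p) for p in text.strip().split(".")]
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"expected [major].[minor] or [major].[minor].[revision], got {text!r}")
    # Assumption: a missing revision component means no corrections yet (revision 0).
    while len(parts) < 3:
        parts.append(0)
    return CorpusVersion(*parts)


def kind_of_change(old: CorpusVersion, new: CorpusVersion) -> str:
    """Name the most significant component that increased between two releases."""
    if new <= old:
        return "none"
    if new.major != old.major:
        return "major"
    if new.minor != old.minor:
        # Per the scheme above: an API or a set of API migrations was added.
        return "minor"
    # Per the scheme above: existing data was corrected.
    return "revision"
```

With this sketch, `kind_of_change(parse_version("1.0"), parse_version("1.1"))` reports a minor release, which matches the addition of Lucene and of log4J 1.2.4 between the two versions listed above.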
Please e-mail corrections or new data to Robert Walker, with the subject line: Library Migration Corpus.
Publications
- Seeking the ground truth: A retroactive study on the evolution and migration of software libraries. In Proceedings of the ACM SIGSOFT International Symposium on the Foundations of Software Engineering, 2012.
- Dependency Detection and Migration in Software Systems and Libraries. PhD thesis. Department of Computer Science, University of Calgary, September 2014.