Citing SmartSHARK

There are multiple publications on SmartSHARK and the curation of data through releases. Depending on which tools or data from the SmartSHARK ecosystem you are using, different citations are appropriate, and sometimes several are needed to give proper credit to the people involved in the different projects. This guide helps you decide what to cite.

Each usage below is followed by the citation to use.
You compare SmartSHARK to other repository mining tools, or are interested in the general architecture of SmartSHARK or in the implications for external validity that motivate our work.
@article{Trautsch2017,
author = {Fabian Trautsch and Steffen Herbold and Philip Makedonski and Jens Grabowski},
title = {Addressing problems with replicability and validity of repository mining studies through a smart data platform},
journal = {Empirical Software Engineering},
year = {2017},
volume = {23},
number = {2},
pages = {1036--1083},
doi = {10.1007/s10664-017-9537-x},
publisher = {Springer Science and Business Media {LLC}}
}

You are using SmartSHARK for data collection or validation, or want to state that SmartSHARK was used by others for this purpose.
@article{Trautsch2017,
author = {Fabian Trautsch and Steffen Herbold and Philip Makedonski and Jens Grabowski},
title = {Addressing problems with replicability and validity of repository mining studies through a smart data platform},
journal = {Empirical Software Engineering},
year = {2017},
volume = {23},
number = {2},
pages = {1036--1083},
doi = {10.1007/s10664-017-9537-x},
publisher = {Springer Science and Business Media {LLC}}
}

You are using the VisualSHARK to manually validate or analyze data.
@inproceedings{Trautsch2020,
author = {Trautsch, Alexander and Trautsch, Fabian and Herbold, Steffen and Ledel, Benjamin and Grabowski, Jens},
title = {The SmartSHARK Ecosystem for Software Repository Mining},
year = {2020},
booktitle = {Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Companion Proceedings},
pages = {25--28},
doi = {10.1145/3377812.3382139},
publisher = {Association for Computing Machinery}
}

You are using the SmartSHARK MongoDB 1.0.
@Misc{Herbold2019,
author = {Steffen Herbold and Alexander Trautsch and Fabian Trautsch and Benjamin Ledel},
title = {Issues with SZZ: An empirical assessment of the state of practice of defect prediction data collection},
year = {2019},
archiveprefix = {arXiv},
eprint = {1911.08938},
}

You are using the SmartSHARK MongoDB 1.1, but you are not using the manual validation of changes on a line level.
@Misc{Herbold2019,
author = {Steffen Herbold and Alexander Trautsch and Fabian Trautsch and Benjamin Ledel},
title = {Issues with SZZ: An empirical assessment of the state of practice of defect prediction data collection},
year = {2019},
archiveprefix = {arXiv},
eprint = {1911.08938},
}

You are using the SmartSHARK MongoDB 1.2 and are using only the manual validation of changes on the line level, but no other data.
@misc{Herbold2020,
author={Steffen Herbold and Alexander Trautsch and Benjamin Ledel and Alireza Aghamohammadi and Taher Ahmed Ghaleb and Kuljit Kaur Chahal and Tim Bossenmaier and Bhaveet Nagaria and Philip Makedonski and Matin Nili Ahmadabadi and Kristof Szabados and Helge Spieker and Matej Madeja and Nathaniel Hoy and Valentina Lenarduzzi and Shangwen Wang and Gema Rodríguez-Pérez and Ricardo Colomo-Palacios and Roberto Verdecchia and Paramvir Singh and Yihao Qin and Debasish Chakroborti and Willard Davis and Vijay Walunj and Hongjun Wu and Diego Marcilio and Omar Alam and Abdullah Aldaeej and Idan Amit and Burak Turhan and Simon Eismann and Anna-Katharina Wickert and Ivano Malavolta and Matus Sulir and Fatemeh Fard and Austin Z. Henley and Stratos Kourtzanidis and Eray Tuzun and Christoph Treude and Simin Maleki Shamasbi and Ivan Pashchenko and Marvin Wyrich and James Davis and Alexander Serebrenik and Ella Albrecht and Ethem Utku Aktas and Daniel Strüber and Johannes Erbel},
title={Large-Scale Manual Validation of Bug Fixing Commits: A Fine-grained Analysis of Tangling},
}

You are using the SmartSHARK MongoDB 1.2 and are using both the manual validation of changes and other data, e.g., manually validated issue types, inducing changes, or even mailing list data.
@Misc{Herbold2019,
author = {Steffen Herbold and Alexander Trautsch and Fabian Trautsch and Benjamin Ledel},
title = {Issues with SZZ: An empirical assessment of the state of practice of defect prediction data collection},
year = {2019},
archiveprefix = {arXiv},
eprint = {1911.08938},
}

@misc{Herbold2020,
author={Steffen Herbold and Alexander Trautsch and Benjamin Ledel and Alireza Aghamohammadi and Taher Ahmed Ghaleb and Kuljit Kaur Chahal and Tim Bossenmaier and Bhaveet Nagaria and Philip Makedonski and Matin Nili Ahmadabadi and Kristof Szabados and Helge Spieker and Matej Madeja and Nathaniel Hoy and Valentina Lenarduzzi and Shangwen Wang and Gema Rodríguez-Pérez and Ricardo Colomo-Palacios and Roberto Verdecchia and Paramvir Singh and Yihao Qin and Debasish Chakroborti and Willard Davis and Vijay Walunj and Hongjun Wu and Diego Marcilio and Omar Alam and Abdullah Aldaeej and Idan Amit and Burak Turhan and Simon Eismann and Anna-Katharina Wickert and Ivano Malavolta and Matus Sulir and Fatemeh Fard and Austin Z. Henley and Stratos Kourtzanidis and Eray Tuzun and Christoph Treude and Simin Maleki Shamasbi and Ivan Pashchenko and Marvin Wyrich and James Davis and Alexander Serebrenik and Ella Albrecht and Ethem Utku Aktas and Daniel Strüber and Johannes Erbel},
title={Large-Scale Manual Validation of Bug Fixing Commits: A Fine-grained Analysis of Tangling},
}

You are using the SmartSHARK MongoDB 2.0 and are using data from the 28 new projects, the new defect label, or the new inducing changes.
@Misc{Trautsch2021,
author = {Alexander Trautsch and Fabian Trautsch and Steffen Herbold},
title = {The SmartSHARK Repository Mining Data},
year = {2021},
archiveprefix = {arXiv},
eprint = {2102.11540},
}
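
As a usage sketch, the entries in this guide could be cited from a LaTeX document as shown below. It assumes the entries you need have been saved as complete BibTeX entries in a file called smartshark.bib (a hypothetical name) and that the plain bibliography style is acceptable. The example sentence is illustrative only. It covers the case where the MongoDB 1.2 is used with both the line-level manual validation and other data, so both keys are cited together.

```latex
\documentclass{article}

\begin{document}

% Illustrative sentence: MongoDB 1.2 with line-level validation AND other data,
% so both Herbold2019 and Herbold2020 are cited.
We use the SmartSHARK MongoDB 1.2, including the line-level manual
validation and the manually validated issue
types~\cite{Herbold2019,Herbold2020}.

\bibliographystyle{plain}
% Assumption: the entries are stored in smartshark.bib (hypothetical file name).
\bibliography{smartshark}

\end{document}
```

Because the same key (for example Herbold2019) is listed under several usages, keep each entry only once in the .bib file. BibTeX reports repeated entries as errors.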