| User's name | User's email | Tool name and version | Tool type | Tool references | Use case short description | Use case documentation | User's affiliation | Feedback | TaDiRAH |
| Your name, as the person providing this piece of information | Your email, just in case we need to check some information with you | The computational tool, software or library whose use you are going to describe. Please use unique referent identifiers (Handle, PID, ...) whenever available and specify the version if applicable. | Specify what type of tool it is: web application, command line tool, library, ... | Add bibliography and/or link to the web page of the tool, and/or page of existing repositories | The real-world DLS use case (research, teaching) where the tool was used. Notice: PLEASE DO NOT INTRODUCE A TOOL DEVELOPED BY YOURSELF OR AD-HOC TOOLS. We are interested in recent use cases (within the last five years) of off-the-shelf tools. Create one line per use case/tool. | Insert here a reference to 1) published research (papers, books, blog posts) with links, 2) projects, or 3) academic courses. Please use the unique identifier of the publication (ISBN, DOI, ...) whenever possible. | Affiliation of the people using the tool at the time of RESEARCH/USE: Institution - City - Country | Brief and constructive report on strengths and weaknesses of the tool/version that you used. Usability with regard to your research question. | Optional. Add link to TaDiRAH taxonomy http://tadirah.dariah.eu/vocab/ |
| Francesca Frontini | francescafrontini@gmail.com | FactoMineR 1.24 | R library | Husson 2011: Husson, F., Josse, J., Le, S., and Mazet, J. "FactoMineR: Multivariate Exploratory Data Analysis and Data Mining with R", R package version 1.24 (2011). http://factominer.free.fr/ | Stylistic comparison of four French novels. The FactoMineR R library was used to perform correspondence analysis (CA) on a set of extracted syntactic patterns in order to compare the style of four French novels. | Francesca Frontini, Mohamed Amine Boukhaled, Jean-Gabriel Ganascia: Mining for characterising patterns in literature using correspondence analysis: an experiment on French novels. Digital Humanities Quarterly 11(2) (2017) | Labex OBVIL, Université Pierre et Marie Curie Paris 6, Paris, France | | http://tadirah.dariah.eu/vocab/index.php?tema=31&/stylistic-analysis |
| Simone Rebora | simone.rebora81@gmail.com | Stylo 0.6.8 | R library | Eder, M., Rybicki, J. and Kestemont, M. (2016). Stylometry with R: a package for computational text analysis. R Journal, 8(1): 107-121, url: https://journal.r-project.org/archive/2016/RJ-2016-007/index.html | Attribution to Robert Musil of a series of articles published in the journal "Tiroler Soldaten-Zeitung" | Rebora, Simone, J. Berenike Herrmann, Massimo Salgaro, and Gerhard Lauer. 2018. “Robert Musil, a War Journal, and Stylometry: Tackling the Issue of Short Texts in Authorship Attribution”. Digital Scholarship in the Humanities, [in press]. https://doi.org/10.1093/llc/fqy055 | University of Verona, Verona, Italy; University of Basel, Basel, Switzerland | The tool was a perfect fit for our research (for both validation and experiments) and was easily integrated into a series of ad-hoc scripts. Only (minor) shortfalls: it imposed a series of operations not immediately useful for our goal, thus increasing computation time; we had to slightly modify the "oppose" code in order to make all the most preferred/avoided words appear in the graph. | http://tadirah.dariah.eu/vocab/index.php?tema=31&/stylistic-analysis |
| Berenike Herrmann | juliaberenike@gmail.com | koRpus 0.10-2 | R library | Michalke, Meik: koRpus: An R Package for Text Analysis (Version 0.10-2), 2017. URL: https://reaktanz.de/?c=hacking&s=koRpus | Used koRpus to measure text fragments for readability, focusing on the Flesch index. Readability scores were correlated with other style markers (POS and relation to metaphor) to approximate a combined measure of "vividness" of the prose. | Herrmann, J. B. (accepted). Operationalisierung der Metapher zur quantifizierenden Untersuchung deutschsprachiger literarischer Texte im Übergang von Realismus zur Moderne. In Jannidis, Fotis (Ed.), Tagungsband des DFG-Symposiums „Digitale Literaturwissenschaft”, Villa Vigoni, De Gruyter. | University of Basel, Basel, Switzerland; University of Göttingen, Göttingen, Germany | | http://tadirah.dariah.eu/vocab/index.php?tema=31&/stylistic-analysis; http://tadirah.dariah.eu/vocab/index.php?tema=30&/structural-analysis |
| Nanette Rißler-Pipka | nanette.rissler@gmail.com | Stylo 0.6.5 | R library | Eder, M., Rybicki, J. and Kestemont, M. (2016). Stylometry with R: a package for computational text analysis. R Journal, 8(1): 107-121, url: https://journal.r-project.org/archive/2016/RJ-2016-007/index.html | Testing several candidates for the authorship of the second volume of the "Quijote", published under the pseudonym Fernández de Avellaneda, and discussing the differences in style between the "Quijote II" by Cervantes and the apocryphal version by Avellaneda. Testing cluster analysis and rolling delta. | Nanette Rißler-Pipka: Die Digitalisierung des goldenen Zeitalters – Editionsproblematik und stilometrische Autorschaftsattribution am Beispiel des Quijote. In: Zeitschrift für digitale Geisteswissenschaften. Wolfenbüttel 2018. text/html Format. DOI: 10.17175/2018_004 | Karlsruhe Institute of Technology, Karlsruhe, Germany; University of Siegen, Siegen, Germany | Well suited to proving unreliable methods of authorship attribution wrong. Not convincing enough for the community of "cervantistas". | http://tadirah.dariah.eu/vocab/index.php?tema=31&/stylistic-analysis |
| Octave Julien | firstname.lastname@univ-paris1.fr | TXM | Desktop or Web application | textometrie.ens-lyon.fr / Heiden, S., Magué, J-P., Pincemin, B. (2010). TXM : Une plateforme logicielle open-source pour la textométrie – conception et développement. In I. C. Sergio Bolasco (Ed.), Proc. of the 10th International Conference on the Statistical Analysis of Textual Data (JADT 2010) (Vol. 2, p. 1021-1032). Edizioni Universitarie di Lettere Economia Diritto, Roma, Italy. Online. / Heiden, S. (2010). The TXM Platform : Building Open-Source Textual Analysis Software Compatible with the TEI Encoding Scheme. In K. I. Ryo Otoguro (Ed.), 24th Pacific Asia Conference on Language, Information and Computation (p. 389-398). Institute for Digital Enhancement of Cognitive Development, Waseda University. | | | PIREH, Université Paris 1 Panthéon Sorbonne | Multipurpose, open-source text analysis software; integrates TreeTagger for lemmatisation, supports CQL queries and multiple common formats for corpus encoding. | |
| Octave Julien | firstname.lastname@univ-paris1.fr | Iramuteq | Desktop application | http://www.iramuteq.org/ | | | PIREH, Université Paris 1 Panthéon Sorbonne | Text analysis software, useful for its implementation of the Reinert/Alceste method (classification of text segments) and co-occurrence analysis. Graphical interface, based on R; also produces R outputs. | |
| Octave Julien | firstname.lastname@univ-paris1.fr | Lexico3 (beta of version 5 available) | Desktop application | http://www.lexi-co.com/index.html | | | PIREH, Université Paris 1 Panthéon Sorbonne | Multipurpose text analysis software. Can easily deal with very large corpora (millions of words). Implements specific and powerful tools for the analysis of chronological evolutions within a corpus, of syntagms, and of patterns of repetition of a word or syntagm within a corpus. | |
| Dominique Legallois | dominique.legallois@sorbonne-nouvelle.fr | Quanteda, tidytext, TM | R library | Kenneth Benoit, Julia Silge | textometry | | | | |
| Dominique Legallois | dominique.legallois@sorbonne-nouvelle.fr | SDMC | | https://tal.lipn.univ-paris13.fr/sdmc/ | extraction of syntactic patterns | | | | |
| Jan Rybicki | jkrybicki@gmail.com | Docuscope | Desktop application | https://www.cmu.edu/dietrich/english/research/docuscope.html | rhetorical analysis | Jonathan Hope, Michael Witmore (2014). "Quantification and the language of later Shakespeare," Actes des congrès de la Société française Shakespeare. 123-149. doi: 10.4000/shakespeare.2830. | Strathclyde U., UK | suite of interactive visualization tools for corpus-based rhetorical analysis | |
| Jan Rybicki | jkrybicki@gmail.com | TRACER | Desktop application | https://www.etrap.eu/research/tracer/ | analysis of intertextuality, versioning, comparing translations | Franzini, G. (2016) ‘English translations of Pan Tadeusz: a comparison with TRACER’, Corpus-based Research in the Humanities workshop. January, 19. Online. | University of Göttingen, Germany | TRACER is a suite of 700 algorithms, whose features can be combined to create the optimal formula for detecting those words, sentences and ideas that have been reused across texts. Created by Marco Büchler, TRACER is designed to facilitate research in text reuse detection, and many have made use of it to identify plagiarism in a text, as well as verbatim and near-verbatim quotations, paraphrase and even allusions. The thousands of feature combinations that TRACER supports make it possible to investigate not only contemporary texts, but also complex historical texts where reuse is harder to spot. | |
| Jan Rybicki | jkrybicki@gmail.com | WCopyFind | Desktop application | http://plagiarism.bloomfieldmedia.com/wordpress/software/wcopyfind/ | plagiarism detection, common word n-gram detection | Anna Filipek (2014). „Pan Tadeusz”, or Translating the Untranslatable: An Analysis of English Translations, M.A. Thesis. Kraków: Uniwersytet Jagielloński | Jagiellonian University, Kraków, Poland | WCopyfind is an open-source, Windows-based program that compares documents and reports similarities in their words and phrases. | |
| Allen Riddell | riddella@indiana.edu | lxml | Python package | https://pypi.org/project/lxml/ | Loading data | | Indiana University Bloomington, Bloomington, USA | None; it's a mature, well-tested Python package. | |
| Allen Riddell | riddella@indiana.edu | matplotlib | Python package | https://pypi.org/project/matplotlib | Plotting | | Indiana University Bloomington, Bloomington, USA | None; it's a mature, well-tested Python package. | |
| Allen Riddell | riddella@indiana.edu | nltk | Python package | https://pypi.org/project/nltk | Text analysis | | Indiana University Bloomington, Bloomington, USA | None; it's a mature, well-tested Python package. | |
| Allen Riddell | riddella@indiana.edu | numpy | Python package | https://pypi.org/project/numpy | Text analysis | | Indiana University Bloomington, Bloomington, USA | None; it's a mature, well-tested Python package. | |
| Allen Riddell | riddella@indiana.edu | scipy | Python package | https://pypi.org/project/scipy | Text analysis | | Indiana University Bloomington, Bloomington, USA | None; it's a mature, well-tested Python package. | |
| Allen Riddell | riddella@indiana.edu | pandas | Python package | https://pypi.org/project/pandas | Loading data, summarizing data | | Indiana University Bloomington, Bloomington, USA | None; it's a mature, well-tested Python package. | |
| Allen Riddell | riddella@indiana.edu | pystan | Python package | https://pypi.org/project/pystan | Analyzing data | | Indiana University Bloomington, Bloomington, USA | None; it's a mature, well-tested Python package. | |
| Allen Riddell | riddella@indiana.edu | scikit-learn | Python package | https://pypi.org/project/scikit-learn | Analyzing data, making predictions | | Indiana University Bloomington, Bloomington, USA | None; it's a mature, well-tested Python package. | |
| Allen Riddell | riddella@indiana.edu | statsmodels | Python package | https://pypi.org/project/statsmodels | Analyzing data, making predictions | | Indiana University Bloomington, Bloomington, USA | None; it's a mature, well-tested Python package. | |
| Allen Riddell | riddella@indiana.edu | pytorch | Python package | https://pytorch.org/ | Analyzing data, making predictions | | Indiana University Bloomington, Bloomington, USA | Easier to use than TensorFlow for building language models of text. | |
| Allen Riddell | riddella@indiana.edu | cartopy | Python package | https://pypi.org/project/cartopy | Making maps | | Indiana University Bloomington, Bloomington, USA | None; it's a mature, well-tested Python package. | |
| Jonathan Reeve | jonathan.reeve@columbia.edu | text-matcher | Python package | https://pypi.org/project/text-matcher/ | Text reuse detection, plagiarism detection | Jonathan Reeve, Milan Terlunen, and Sierra Eckert. "Middlemarch Critical Histories." [Forthcoming] | Columbia University, New York City, USA | Needs a test suite and better documentation. | |
| Jonathan Reeve | jonathan.reeve@columbia.edu | macro-etym | Python package | https://github.com/JonathanReeve/macro-etym | Macro-etymological text analysis | Reeve, Jonathan. "A macro-etymological analysis of James Joyce’s A Portrait of the Artist as a Young Man." Reading Modernism with Machines. Palgrave Macmillan, London, 2016. 203-222. | Columbia University, New York City, USA | Needs a test suite, some bug fixes, and better packaging for data on PyPI. | |
| Jonathan Reeve | jonathan.reeve@columbia.edu | chapterize | Python package | https://pypi.org/project/chapterize/ | Text segmentation | | Columbia University, New York City, USA | Needs a less deterministic approach to chapter detection. | |
| Jonathan Reeve | jonathan.reeve@columbia.edu | spacy | Python package | https://spacy.io/ | Natural language processing | | | Language models are often difficult to install. | |
| Jonathan Reeve | jonathan.reeve@columbia.edu | textacy | Python package | https://pypi.org/project/textacy/ | Natural language processing | | | Some open bugs (see GitHub issues). | |
| Fotis Jannidis | fotis@jannidis.de | gensim | Python package | https://radimrehurek.com/gensim/ | Natural language processing | | | | |
| Fotis Jannidis | fotis@jannidis.de | spacy | Python package | https://spacy.io/ | Natural language processing | | | | |
| Fotis Jannidis | fotis@jannidis.de | umap-learn | Python package | https://github.com/lmcinnes/umap | Dimensionality reduction | | | | |
| Fotis Jannidis | fotis@jannidis.de | seaborn | Python package | https://seaborn.pydata.org/ | Visualization | | | | |
| Fotis Jannidis | fotis@jannidis.de | keras | Python package | https://keras.io/ | Deep Learning Framework | | | | |