Multilingual Classifier for Automatic Dewey Decimal Classification trained on Open Access Linguistics Abstracts
Thüringer Universitäts- und Landesbibliothek · Universitätsbibliothek Johann Christian Senckenberg
Abstract
This is the baseline model featured in the conference paper for the 28th International Conference on Theory and Practice of Digital Libraries, titled "Towards Multilingual LLM-based Approaches for Automatic Dewey Decimal Classification". It is fine-tuned from the pretrained model "sentence-transformers/multi-qa-MiniLM-L6-cos-v1" (https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1) on open access linguistics texts whose metadata assigns them a Dewey Decimal Classification (DDC) class between 400 and 499. This version is trained on the "description" metadata field. Training used the Hugging Face Trainer, and the model can be run locally like other Hugging Face models.
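Running the model locally could look roughly like the sketch below. It is not taken from the record: it assumes the weights are published with a sequence-classification head that the `transformers` text-classification pipeline can load, and that the predicted labels are DDC numbers written as strings. `MODEL_ID` is a hypothetical placeholder, and `classify_descriptions` and `is_linguistics_class` are illustrative helper names.

```python
from typing import Dict, List

# Hypothetical placeholder: replace with the actual model repository or a local path.
MODEL_ID = "your-org/ddc-linguistics-classifier"


def is_linguistics_class(label: str) -> bool:
    """Return True if a predicted label parses as a DDC number in 400-499.

    Assumes labels are DDC numbers as strings (e.g. "410"); anything that
    does not parse as a number is rejected.
    """
    try:
        number = float(label)
    except ValueError:
        return False
    return 400 <= number < 500


def classify_descriptions(texts: List[str], model_id: str = MODEL_ID) -> List[Dict]:
    """Classify abstract/description texts with a locally loaded model.

    Assumes the model exposes a sequence-classification head.
    """
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # Truncate long descriptions to the model's maximum input length.
    return classifier(list(texts), truncation=True)
```

Under these assumptions, `classify_descriptions(["A corpus study of German word order."])` would return one dictionary per input with a `label` and a `score`, and `is_linguistics_class` can be used to check that a predicted label falls in the 400 to 499 range the model was trained on.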
Authors
1. Ho, Clara Wan Ching (corresponding author)
Thüringer Universitäts- und Landesbibliothek, Universitätsbibliothek Johann Christian Senckenberg
Topics & keywords
- Sentence
- Computer science
- Natural language processing
- Artificial intelligence
- Inference
- Similarity (geometry)
- Cosine similarity
- Set (abstract data type)