Preprint · Repositorio Institucional UPTC · Jul 18, 2024 · Green OA

Multilingual Classifier for Automatic Dewey Decimal Classification trained on Open Access Linguistics Abstracts

Ho, Clara Wan Ching

Thüringer Universitäts- und Landesbibliothek · Universitätsbibliothek Johann Christian Senckenberg

Indexed in DataCite

Abstract

This is the baseline model featured in the conference paper "Towards Multilingual LLM-based Approaches for Automatic Dewey Decimal Classification", written for the 28th International Conference on Theory and Practice of Digital Libraries. The model is fine-tuned from the pretrained model "sentence-transformers/multi-qa-MiniLM-L6-cos-v1" (https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1) on open access linguistics texts with metadata, each assigned a Dewey Decimal Classification (DDC) class between 400 and 499. This version is trained on the "description" metadata field. Training used the Hugging Face Trainer, and the model can be run locally like other Hugging Face models.
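As a rough illustration of running the model locally, the sketch below loads a downloaded copy with the transformers library and ranks the DDC classes for one abstract. It rests on assumptions not stated in the record: that the checkpoint was saved with a sequence-classification head (so that AutoModelForSequenceClassification can load it), that its config carries an id2label mapping, and that model_dir points to a local copy of the files. The function names top_ddc_classes and classify_abstract are hypothetical helpers, not part of the released model.

```python
import math


def top_ddc_classes(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs from raw logits."""
    # Numerically stable softmax over the raw scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


def classify_abstract(text, model_dir, k=3):
    """Classify one abstract with a locally stored checkpoint.

    Assumption: model_dir holds a checkpoint with a classification head.
    """
    # Imported lazily so top_ddc_classes works without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    model.eval()

    inputs = tokenizer(text, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_ddc_classes(logits, model.config.id2label, k)
```

A call such as classify_abstract("This study examines vowel harmony in Turkic languages.", "path/to/local/model") would return up to three (label, probability) pairs. The path here is a placeholder, and the labels returned depend entirely on what the checkpoint's config defines.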

Citation impact

332 total citations
References
0
Citations per year

Authors

  • Ho, Clara Wan Ching (Corresponding)

    Thüringer Universitäts- und Landesbibliothek, Universitätsbibliothek Johann Christian Senckenberg

Topics & keywords

Keywords
  • Sentence
  • Computer science
  • Natural language processing
  • Artificial intelligence
  • Inference
  • Similarity (geometry)
  • Cosine similarity
  • Set (abstract data type)