In a Training Loop 🔄
1781
322
151
Stefan Schweter
PRO
stefan-it
3,664 followers · 380 following
https://schweter.bayern
AI & ML interests
Flair Library 💕, NER & PoS Tagging, LM Pretraining (mostly encoder-only & encoder-decoder), Historical Language Models, German Language Models, Bavarian NLP 🥨
Recent Activity
Upvoted a paper about 17 hours ago: LoRA-Squeeze: Simple and Effective Post-Tuning and In-Tuning Compression of LoRA Modules
Upvoted a paper about 18 hours ago: Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning
Commented on a paper about 20 hours ago: Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning
stefan-it's models (1,344)
stefan-it/xlstm-transformers-bug-triton
Updated
Nov 8, 2025
•
2
stefan-it/xlstm-transformers-bug-native
Updated
Nov 8, 2025
•
4
stefan-it/nanochat-german-v1
0.6B
•
Updated
Oct 28, 2025
•
6
•
1
stefan-it/nanochat-german-base-checkpoint
Updated
Oct 25, 2025
stefan-it/nanochat-german-base
0.6B
•
Updated
Oct 24, 2025
•
6
stefan-it/nanochat-german-tokenizer
Updated
Oct 24, 2025
•
1
stefan-it/ettin-encoder-400m-tokenizer-fix
Fill-Mask
•
0.4B
•
Updated
Jul 20, 2025
•
2
stefan-it/flair-ettin-400m-ner-conll03
Updated
Jul 17, 2025
stefan-it/ModernBERT-large-tokenizer-fix
Fill-Mask
•
0.4B
•
Updated
Jul 16, 2025
•
14
•
2
stefan-it/flair-modernbert-large-ner-conll03
Updated
May 9, 2025
stefan-it/bert5urk
1B
•
Updated
Mar 3, 2025
•
23
•
13
stefan-it/neobert-ner-conll03
0.2B
•
Updated
Mar 2, 2025
•
5
•
1
stefan-it/electra-base-gc4-64k-0-cased-discriminator
0.1B
•
Updated
Mar 1, 2025
•
8
•
1
stefan-it/electra-base-gc4-64k-100000-cased-discriminator
0.1B
•
Updated
Mar 1, 2025
•
5
stefan-it/electra-base-gc4-64k-200000-cased-discriminator
0.1B
•
Updated
Mar 1, 2025
•
4
stefan-it/electra-base-gc4-64k-300000-cased-discriminator
0.1B
•
Updated
Mar 1, 2025
•
11
stefan-it/electra-base-gc4-64k-400000-cased-discriminator
0.1B
•
Updated
Mar 1, 2025
•
6
stefan-it/electra-base-gc4-64k-500000-cased-discriminator
0.1B
•
Updated
Mar 1, 2025
•
5
stefan-it/electra-base-gc4-64k-600000-cased-discriminator
0.1B
•
Updated
Mar 1, 2025
•
7
stefan-it/electra-base-gc4-64k-700000-cased-discriminator
0.1B
•
Updated
Mar 1, 2025
•
7
stefan-it/electra-base-gc4-64k-800000-cased-discriminator
0.1B
•
Updated
Mar 1, 2025
•
6
stefan-it/electra-base-gc4-64k-900000-cased-discriminator
0.1B
•
Updated
Mar 1, 2025
•
7
stefan-it/electra-base-gc4-64k-1000000-cased-discriminator
0.1B
•
Updated
Mar 1, 2025
•
3
stefan-it/electra-base-gc4-64k-300000-cased-generator
Fill-Mask
•
59.5M
•
Updated
Mar 1, 2025
•
7
stefan-it/electra-base-gc4-64k-400000-cased-generator
Fill-Mask
•
59.5M
•
Updated
Mar 1, 2025
•
4
stefan-it/electra-base-gc4-64k-500000-cased-generator
Fill-Mask
•
59.5M
•
Updated
Mar 1, 2025
•
3
stefan-it/electra-base-gc4-64k-600000-cased-generator
Fill-Mask
•
59.5M
•
Updated
Mar 1, 2025
•
5
stefan-it/electra-base-gc4-64k-700000-cased-generator
Fill-Mask
•
59.5M
•
Updated
Mar 1, 2025
•
5
stefan-it/electra-base-gc4-64k-800000-cased-generator
Fill-Mask
•
59.5M
•
Updated
Mar 1, 2025
•
3
stefan-it/electra-base-gc4-64k-900000-cased-generator
Fill-Mask
•
59.5M
•
Updated
Mar 1, 2025
•
4