AI & ML interests

None defined yet.

Recent Activity

Ujjwal-Tyagi posted an update 10 days ago

Ujjwal-Tyagi posted an update 16 days ago
There is a new open-source music generation model called HeartMuLa. It offers strong performance, competitive with Suno, and supports English, Chinese, Japanese, Korean, and Spanish. It is optimized to run easily on RTX GPUs and other consumer-grade hardware. HeartMuLa/HeartMuLa-oss-3B
https://github.com/HeartMuLa/heartlib

Ujjwal-Tyagi posted an update 19 days ago
So, Korean labs are also making great progress, right behind the Chinese. Here are two of their open-source AI models that are actually good at coding: upstage/Solar-Open-100B and skt/A.X-K1

Ujjwal-Tyagi posted an update 20 days ago

Ujjwal-Tyagi posted an update 25 days ago
I am very excited to see the release of nyuuzyou/gitee-code. This is exactly what I have been looking for. Thank you to @nyuuzyou for his hard work on this.

Ujjwal-Tyagi posted an update 26 days ago
I’m looking for AI engineers and researchers to join my company as part of the core team. We’ll be working on cutting-edge research and hands-on implementation across LLMs and related systems. I’m especially interested in founding engineers for my AI startup who want to build from the ground up and shape both the product and the research direction. If this sounds interesting to you, reply to this post and message me on Discord; my username is "ujjwal_tyagi.shirova". Please also attach your resume and details of your open-source projects (if any are related to LLMs) on Discord, and avoid sharing them here as a reply to this post.

Ujjwal-Tyagi posted an update 29 days ago
For a more detailed analysis, you can read the article here: https://huggingface.co/blog/Ujjwal-Tyagi/steering-not-censoring. We are sleepwalking into a crisis. I am deeply concerned about AI model safety right now because, as the community rushes to roll out increasingly powerful open-source models, we are completely neglecting the most critical aspect: safety. It seems that nobody is seriously thinking about the potential consequences of unregulated model outputs or the necessity of robust guardrails. We are essentially planting the seeds for our own destruction if we prioritize raw performance over security.

This negligence is terrifyingly evident when you look at the current landscape. Take Qwen Image 2512, for example; while it delivers undeniably strong performance, it has incredibly weak guardrails that make it dangerous to deploy. In stark contrast, Z Image might not get as much hype for its power, but it has much better safety guardrails than Qwen Image 2512.

It is imperative that the open-source community and developers recognize that capability without responsibility is a liability. We must actively work on protecting these models from bad actors who seek to exploit them for malicious purposes, such as generating disinformation, creating non-consensual imagery, or automating cyberattacks. It is no longer enough to simply release a powerful model; we must build layers of defense that make it resistant to jailbreaking and adversarial attacks. Developers need to prioritize alignment and robust filtering techniques just as much as they prioritize benchmark scores. We cannot hand such potent tools to the world without ensuring they have the safety mechanisms to prevent them from being turned against us.

KingNish posted an update about 2 months ago
Muon vs MuonClip vs Muon+AdamW

Muon has gone from an experiment to a mainstream optimizer, but does it hold up for fine‑tuning? We ran head‑to‑head tests on Qwen3‑4B (10k+ high‑quality instruction rows) to find out.

Short story: Pure Muon converged fastest at the start, but its gradient‑norm spikes made training unstable. MuonClip (Kimi K2’s clipping) stabilizes long pretraining runs, yet in our small‑scale fine‑tune it underperformed: lower token accuracy and slower convergence. The winner was the hybrid: Muon for 2D layers + AdamW for 1D layers. It delivered the best balance of stability and final performance and even beat vanilla AdamW.

Takeaway: for small-scale fine-tuning, hybrid = practical and reliable.

Next Step: scale to larger models/datasets to see if Muon’s spikes become catastrophic or if clipping wins out.

Full Blog Link: https://huggingface.co/blog/KingNish/optimizer-part1
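
For anyone who wants to try the hybrid setup, here is a minimal sketch of the parameter-grouping idea: route 2D weight matrices to Muon and everything 1D (biases, norm scales) to AdamW. The Muon import and the hyperparameters below are illustrative assumptions, not the exact configuration from the blog post.

```python
import torch
from torch import nn

# Assumption: a Muon implementation exposing a torch.optim-style interface
# (e.g. a reference implementation vendored into your training code).
from muon import Muon  # hypothetical import

# Stand-in model; in the experiment above this was Qwen3-4B.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))

# Split parameters by dimensionality: Muon gets the weight matrices,
# AdamW gets the 1D tensors (biases, norm scales).
muon_params = [p for p in model.parameters() if p.ndim >= 2]
adamw_params = [p for p in model.parameters() if p.ndim < 2]

optimizers = [
    Muon(muon_params, lr=2e-2, momentum=0.95),                    # illustrative LR
    torch.optim.AdamW(adamw_params, lr=2e-4, weight_decay=0.01),  # illustrative LR
]

# In the training loop, step both optimizers on the same backward pass:
#   loss.backward()
#   for opt in optimizers:
#       opt.step()
#       opt.zero_grad()
```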

KingNish posted an update 2 months ago

mrfakename posted an update 2 months ago
Excited to share that I've joined the Hugging Face Fellows program! 🤗

Looking forward to contributing to & working more closely with the open-source ecosystem - huge thanks to everyone who's supported me on this journey! 🚀

adamm-hf posted an update 3 months ago

adamm-hf posted an update 3 months ago

adamm-hf posted an update 3 months ago
💸🤑You don’t need 100 GPUs to train something amazing!

Our Smol Training Playbook teaches you a better path to world-class LLMs, for free!

Check out the #1 trending space on 🤗 :
HuggingFaceTB/smol-training-playbook

mrfakename posted an update 3 months ago
Trained a model for emotion-controllable TTS based on MiMo audio on LAION's dataset.

Still very early and does have an issue with hallucinating but results seem pretty good so far, given that it is very early into the training run.

Will probably kick off a new run later with some settings tweaked.

Put up a demo here: https://huggingface.co/spaces/mrfakename/EmoAct-MiMo

(Turn 🔊 on to hear audio samples)

adamm-hf posted an update 4 months ago
Cool stuff these past weeks on Hugging Face! 🤗🚀
• 📈Trackio, local-first W&B alternative
https://github.com/gradio-app/trackio/issues
• 🌍EmbeddingGemma, 300M-param, multilingual embeddings, on-device
https://huggingface.co/blog/embeddinggemma
• 💻Open LLMs in VS Code (Inference Providers)
https://x.com/reach_vb/status/1966185427582497171
• 🤖Smol2Operator GUI agents
https://huggingface.co/blog/smol2operator
• 🖼️Gradio visible watermarking
https://huggingface.co/blog/watermarking-with-gradio

hannayukhymenko posted an update 5 months ago
Releasing the Jupyter Agent Dataset! 🚀

Built from 7 TB of real Kaggle datasets + 20k notebooks, creating real code exec traces using Qwen3-Coder and E2B.
Training on this data dramatically improves the ability to execute code and analyze data.

We ( @baptistecolle @hannayukhymenko @lvwerra ) have created a novel synthetic data generation pipeline with efficient scaffolding, which gives a big performance boost when training your coding agent. 🔥 With the help of real Kaggle notebooks and datasets, we generate synthetic notebooks that aim to analyze datasets and answer factual questions about them more efficiently. We simulate a real code execution environment by prompting LLMs or with the help of E2B sandboxes. We have built a dataset of 50k+ high-quality LLM-generated notebooks that can help your agent become better at performing data analysis and question answering.

Link: https://huggingface.co/datasets/data-agents/jupyter-agent-dataset
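
As a quick start, here is a minimal sketch of loading the dataset with the datasets library; the split name is an assumption, so check the dataset card for the exact configurations and columns.

```python
from datasets import load_dataset

# Pull the Jupyter Agent Dataset from the Hugging Face Hub.
# The "train" split is an assumption; see the dataset card for details.
ds = load_dataset("data-agents/jupyter-agent-dataset", split="train")

print(ds)     # dataset size and column names
print(ds[0])  # inspect one synthetic notebook / execution trace
```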

jeffboudier posted an update 6 months ago
Quick 30s demo of the new Hub > Azure AI integration to deploy HF models in your own Azure account. Now with Python and CLI!

GG @alvarobartt @kramp @pagezyhf

KingNish posted an update 6 months ago

jeffboudier posted an update 8 months ago
AMD summer hackathons are here!
A chance to get hands-on with MI300X GPUs and accelerate models.
🇫🇷 Paris - Station F - July 5-6
🇮🇳 Mumbai - July 12-13
🇮🇳 Bengaluru - July 19-20

Hugging Face and GPU Mode will be on site, and on July 6 in Paris @ror will share lessons learned building new kernels to accelerate Llama 3.1 405B on ROCm.

Register to Paris event: https://lu.ma/fmvdjmur?tk=KeAbiP
All dates: https://lu.ma/calendar/cal-3sxhD5FdxWsMDIz

jeffboudier posted an update 8 months ago
Today we launched Training Cluster as a Service, to make the new DGX Cloud Lepton supercloud easily accessible to AI researchers.

Hugging Face will collaborate with NVIDIA to provision and set up GPU training clusters to make them available for the duration of training runs.

Hugging Face organizations can sign up here: https://huggingface.co/training-cluster