Stable Genius
Regular
RT is a fan of Hugging Face
and Edge Impulse is excited and has commented on this as well. Looks like it's another software technology advancement to help with the implementation of our transformers, to be released soon:
www.linkedin.com
We just released Transformers' boldest feature: Transformers Agents.
This removes the barrier to entry for machine learning: it lets more than 100,000 HF models be controlled with natural language.
This is a truly fully multimodal agent: text, images, video, audio, docs, and way more to come.
Read the documentation here: https://lnkd.in/eAmxV_zi
What is it?
Create an agent using LLMs (OpenAssistant, StarCoder, OpenAI ...) and start talking to transformers and diffusers through curated tools.
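To make that concrete, here is a minimal sketch following the linked documentation; the StarCoder endpoint URL, the image path, and the prompts are just illustrative:

```python
# Minimal sketch of creating and using an agent, per the Transformers
# Agents docs. Endpoint URL, image path, and prompts are illustrative.
from PIL import Image
from transformers import HfAgent

# Back the agent with an open LLM served through the HF Inference API
# (StarCoder here; an OpenAssistant endpoint or OpenAiAgent works too).
agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")

image = Image.open("photo.jpg")  # any local image

# One-shot task: the agent picks the right tool and runs it.
caption = agent.run("Caption the following image.", image=image)

# Or hold a multi-turn conversation instead with agent.chat().
agent.chat("Now translate that caption into French.")
```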
How does it work?
It's straightforward prompt-building:
• Tell the agent what it aims to do
• Give it tools
• Show examples
• Give it a task
The agent uses chain-of-thought reasoning to identify its task and outputs Python code using the tools. It comes with a myriad of built-in tools, such as document, text, and image QA; speech-to-text and text-to-speech; text classification; summarization; translation; image editing; and text-to-video...
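If you'd rather inspect the code it writes than execute it, the docs expose a return_code option on run; a quick sketch (the tool name in the comment is from the built-in set, and the exact generated code varies by model):

```python
# Sketch: ask the agent for the Python code it would run, instead of
# running it, via run(..., return_code=True) from the docs.
from transformers import HfAgent

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")

code = agent.run(
    "Read the following text out loud.",
    text="Transformers Agents is out!",
    return_code=True,
)
print(code)
# Typically something along the lines of:
#   audio = text_reader(text)
```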
But it is EXTENSIBLE by design.
Tools are elementary: a name, a description, a function. Designing a tool and pushing it to the Hub can be done in a few lines of code.
The agent's toolkit serves as a base: extend it with your own tools, or with other community-contributed tools.
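As a rough sketch of how small a tool really is (the class, tool name, and Hub repo id below are made up for illustration):

```python
# Hypothetical custom tool, following the Tool API in the docs.
# The tool name, description, and repo id are invented for this example.
from transformers import HfAgent, Tool, load_tool

class ShoutTool(Tool):
    name = "text_shouter"
    description = "Takes some text as input and returns it in upper case."
    inputs = ["text"]
    outputs = ["text"]

    def __call__(self, text: str) -> str:
        return text.upper()

# Share it on the Hub in one line...
ShoutTool().push_to_hub("your-username/text-shouter")

# ...and anyone can pull it back down and hand it to an agent.
community_tool = load_tool("your-username/text-shouter")
agent = HfAgent(
    "https://api-inference.huggingface.co/models/bigcode/starcoder",
    additional_tools=[community_tool],
)
```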
Please play with it, add your tools, and let's create super-powerful agents together.
Here's a notebook to get started: https://lnkd.in/eYsh9eqG


Edge Impulse on LinkedIn: Exciting news! We can't wait to try it out to help speed up dataset creation for edge AI – many uses for augmentation, auto-labelling, and data synthesis! In…

