bnew


David Attenborough narrates your life




 

Micky Mikey


bnew


Data labeling companies are raising prices in the AI boom​


AI chatbots require a lot of high-quality data—and now it's costing more​

By Michelle Cheng
Published November 9, 2023

Avatars are still clunky-looking. Photo: a user customizes a Replika AI avatar on a smartphone app. Luka, Inc./Handout via Reuters

Need another indicator that the generative artificial intelligence industry is real and making progress? Look at the booming business of data labeling and annotation, an essential step in training the models that power AI products ranging from what's currently in vogue in the industry—chatbots!—to ongoing projects such as self-driving vehicles and tools that diagnose diseases.

During the data labeling step, a team of humans typically identifies data points, whether that's the severity of the damage in 100,000 photos of different cars for an insurance company or the sentiments of people who interact with support agents for a customer service company. Data annotation is a critical step in training large language models (LLMs) like OpenAI's GPT because it makes AI models more accurate.
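As a rough illustration of what such a label might look like, here is a hypothetical annotation record for the car-damage example, written in Python. The field names and severity scale are invented for illustration and are not any vendor's actual schema.

    # Hypothetical annotation record for the car-damage labeling example above.
    # Field names and the 0-5 severity scale are illustrative, not a real vendor schema.
    damage_label = {
        "image_id": "car_000123.jpg",
        "annotator_id": "worker_42",
        "damage_severity": 3,  # 0 = no damage, 5 = total loss
        "damaged_parts": ["front bumper", "hood"],
        "notes": "surface scratches, no structural damage",
    }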

Following OpenAI's release of ChatGPT last November, data annotation companies have seen so much demand that some of them are raising prices.

Realeyes is a company based in London that uses computer vision to read and understand human behavior; that data is then used to improve advertising effectiveness or to minimize identity fraud. Because it was already collecting and labeling data for its own computer vision algorithms, the company decided two years ago to move into the analogous service of labeling data for other companies, said Mihkel Jäätma, the CEO of Realeyes, which works with over 200 companies across media, technology, and advertising.

The data labeling service began generating revenue last year, with the business getting “very big, very quickly,” he said. Jäätma estimates that 80% of the business comes from companies essentially looking to make avatars less cartoonish. “It’s really kind of exploded to be a very substantial part of our business only in the last two years and keeps going that way,” he said.

From the likes of big tech companies and well-funded AI startups, “[t]he investment that we see is that this is going to be overlaid with very human-like [features],” he said. In other words, the work now is to make these avatars—bots that exhibit personalities based on made-up characters or real people—understand users and talk in a more human way.

Since the launch of its data labeling service, Realeyes has raised prices at least twice. Jäätma said he has had to tell customers that if they weren’t willing to pay up, Realeyes would not complete the full request.


Making avatars more human-like​

Labeling audio and visual recordings is complex. It's not just data scraped from the internet. Human annotators work on assessing people's emotions, for example—and as that work gets more nuanced, it means paying the annotators more. (Realeyes was reportedly hired by Meta, which rolled out its own AI avatars in September, to make the tech giant's avatars more human.)

Meanwhile, Snorkel AI, a company specializing in data labeling, said that the number of inquiries it received in the past three months was more than five times the total number received in the entire previous year, with requests coming from early-stage startups building large-language models (LLMs), as well as government agencies and IT companies.

The Redwood City, California-based company has not raised prices, but it has rolled out additional service offerings around AI training since customers’ needs have diversified.


Data labeling is already a $2.2 billion industry​

The growth in data labeling shows that generative AI applications are making progress. “With ChatGPT and other developments, the applications of AI are not out of reach,” said Devang Sachdev, vice president of marketing at Snorkel AI. The surge in AI products comes as LLMs from the likes of Google and OpenAI have also become much more accessible.

The global data collection and labeling market hit $2.2 billion in 2022 and it is expected to grow nearly 30% from 2023 to 2030, according to market research firm Grand View Research.
 

Trav

I wonder if Satya put the hit out behind closed doors now that Microsoft was officially on board :patrice:
 

bnew


Meta launches AI-based video editing tools​

Reuters

November 16, 2023, 6:53 PM EST. Updated 2 days ago

Meta AI logo is seen in this illustration taken September 28, 2023. REUTERS/Dado Ruvic/Illustration/File Photo

Nov 16 (Reuters) - Meta Platforms (META.O) on Thursday launched two new AI-based features for video editing that could be used for posting to Instagram or Facebook.

The first, called Emu Video, generates four-second-long videos from a prompt consisting of a caption, a photo, or an image paired with a description. The other, known as Emu Edit, allows users to more easily alter or edit videos with text prompts.

The new tools build on the parent model, Emu, which generates images in response to text prompts.

Emu underpins generative AI technology and some AI image editing tools for Instagram that let users take a photo and change its visual style or background.

Since the launch of OpenAI's ChatGPT late last year, businesses and enterprises have flocked to the nascent generative AI market for new capabilities and to refine business processes.

The social media giant has been making rapid strides in AI, which has become one of its most significant focus areas as it looks to compete with other giants such as Microsoft (MSFT.O), Alphabet's Google (GOOGL.O) and Amazon (AMZN.O).

Reporting by Priyamvada C in Bengaluru; Editing by Rashmi Aich
 

bnew



Discord is shutting down its AI chatbot Clyde​



Discord users won’t be able to chat to Clyde from December 1st onwards.​

By Tom Warren, a senior editor covering Microsoft, PC gaming, console, and tech. He founded WinRumors, a site dedicated to Microsoft news, before joining The Verge in 2012.

Nov 17, 2023, 5:47 AM EST

Illustration: Alex Castro / The Verge

Discord is shutting down Clyde, its experimental AI chatbot. In a support note, Discord says the chatbot will be “deactivated” at the end of the month, and that by December 1st “users will no longer be able to invoke Clyde in DMs, Group DMs or server chats.”

Discord first started testing Clyde’s AI features earlier this year, using OpenAI’s models to let the chatbot answer questions and have conversations with Discord users. It has been in limited testing ever since, and the company had planned to make it a fundamental part of its chat and communities app.

The Clyde AI chatbot. Image: Discord

It’s not clear why Clyde is suddenly shutting down. It’s possible the chatbot may return as a paid Nitro-only feature in the future, or perhaps Discord has learned enough from its testing period and decided an AI chatbot doesn’t need to be baked into its service. Discord isn’t saying exactly why it’s shutting down, though.

“Clyde is an experiment shared with a small percentage of servers,” says Kellyn Slone, director of product communications at Discord, in a statement to The Verge. “Discord is constantly working on bringing users new features and experiences. Clyde is one iteration of this work, and we look forward to unveiling new user experiences in the future.”

Discord has been experimenting with a variety of AI features, including AI-generated conversation summaries. These allow Discord users to catch up on conversations they might have missed, which is particularly useful for servers that span multiple time zones. Discord has also been trying to position its platform as a home for AI developers, with funds and resources available to help developers build AI apps for Discord.

Update, November 17th 12PM ET: Article updated with a comment from Discord.
 

bnew





Introducing Alfred-40B-1023:​

Pioneering the Future of Open-Source Language Models from LightOn
We are thrilled to unveil Alfred-40B-1023, the latest iteration of our celebrated open-source Language Model. Building on the solid foundation of its predecessor, Alfred-40B-1023 represents a significant leap forward, catering specifically to the needs of enterprises.

What's New in Alfred-40B-1023?

Alfred-40B-1023 boasts a range of enhancements and new features, including:
  • Reduced Hallucinations: One of the standout features of Alfred-40B-1023 is its refined ability to minimize hallucinations, ensuring more accurate and reliable outputs.
  • Enhanced Self-Awareness: In situations where the model lacks a definitive answer, Alfred-40B-1023 is now programmed to state, "I don't know", enhancing its transparency and trustworthiness.
  • Superior 'Chat with Docs' Capability: Alfred-40B-1023 is trained to perform 'Chat with Docs' tasks like no other, streamlining document interaction and information retrieval.
  • Expanded Context: With an increased context of 8K tokens, Alfred-40B-1023 can comprehend and generate longer and more intricate content, ensuring detailed and comprehensive responses.

Continuing the Legacy of Its Predecessor​

Much like its predecessor, which was recognized for its prompt engineering, no-code application development, and classic LLM tasks, Alfred-40B-1023 is poised to set new benchmarks. The Falcon model continues to be the backbone, with Alfred-40B-1023 refining its capabilities to serve as an even more effective Generative AI copilot.

More than Just a Model – Introducing Paradigm:​

Paradigm is not just a platform; it's a conviction. We firmly believe that the future of AI in enterprises and governments lies not just in models but in a robust platform equipped with tools to deploy this groundbreaking technology seamlessly.

Commitment to Open Source​

In line with LightOn's dedication to promoting progress in the field, Alfred-40B-1023 is offered as an open-source model. Because we are continually enhancing the model, the open-source version might differ from the one on the Paradigm platform, ensuring that Paradigm users always have access to the most advanced version.

Training and Accessibility​

Alfred-40B-1023 continues to benefit from the efficient and reliable infrastructure of Amazon SageMaker for its training. Soon, Alfred-40B-1023 will also be available on platforms like HuggingFace and AWS Jumpstart for Foundation Models, making its integration into diverse workflows even smoother.
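As a minimal sketch of what that integration could look like once the model is published on Hugging Face, the snippet below loads it with the transformers library. The repository id and generation settings are assumptions, not details from this announcement, and a 40B-parameter model needs substantial GPU memory.

    # Minimal sketch: loading Alfred-40B-1023 via Hugging Face transformers.
    # The repo id below is an assumption, not confirmed by this announcement.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "lightonai/alfred-40b-1023"  # assumed repository name
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",   # spread the 40B weights across available GPUs
        torch_dtype="auto",
    )

    prompt = "Summarize the following document in three bullet points: ..."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=200)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))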

Join Us in This Exciting Journey​

We believe in the collective strength of the Generative AI community. With the release of Alfred-40B-1023, we showcase our deep expertise in training personalized models tailored to our clients' needs.

We invite the global community to join us. Dive into Alfred-40B-1023, contribute, and be a part of this transformative journey. We're not just offering a model; we're sharing a vision of the future.
Be sure to catch our unveiling of Alfred-40B-1023 during the AI Pulse Keynote.

About LightOn:​

A torchbearer in the Generative AI domain, LightOn is redefining the boundaries of AI capabilities. With a blend of pioneering models and robust platforms like Paradigm, we're guiding enterprises and governments into the next AI frontier.


 

bnew




LLaMA LLM: All Versions & Hardware Requirements
Explore all versions of the model, their file formats like GGML, GPTQ, and HF, and understand the hardware requirements for local inference.
Meta has released LLaMA (v1) (Large Language Model Meta AI), a foundational language model designed to assist researchers in the AI field. LLaMA distinguishes itself due to its smaller, more efficient size, making it less resource-intensive than some other large models. These foundation models train on vast amounts of unlabeled data, allowing them to be tailored for a multitude of tasks. LLaMA is available in various sizes, including 7B, 13B, 33B, and 65B parameters.

The model was trained using text from the 20 languages with the highest number of speakers, primarily focusing on those with Latin and Cyrillic scripts. While the model was designed to be versatile for a wide range of applications, challenges like potential biases remain. To manage responsible dissemination, Meta provides LLaMA under a noncommercial license, granting access mainly for research purposes on a case-by-case basis.
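As a rough back-of-the-envelope sketch of those hardware requirements, the memory needed just to hold the weights scales with parameter count times bytes per parameter. The figures below ignore activations, the KV cache, and framework overhead, so real-world needs are somewhat higher.

    # Approximate memory needed just to hold LLaMA weights at different precisions.
    # Ignores activations, KV cache, and runtime overhead; real requirements are higher.
    PARAMS_BILLIONS = {"7B": 7, "13B": 13, "33B": 33, "65B": 65}
    BYTES_PER_PARAM = {"fp16 (HF)": 2.0, "8-bit": 1.0, "4-bit (GPTQ/GGML)": 0.5}

    for size, billions in PARAMS_BILLIONS.items():
        estimates = ", ".join(
            f"{fmt}: ~{billions * nbytes:.1f} GB" for fmt, nbytes in BYTES_PER_PARAM.items()
        )
        print(f"LLaMA {size}: {estimates}")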
 

bnew


DEEP LEARNING COURSE​

You can find here slides, recordings, and a virtual machine for François Fleuret's deep-learning courses 14x050 of the University of Geneva, Switzerland.

This course is a thorough introduction to deep learning, with examples in the PyTorch framework (a minimal sketch follows the topic list below):

  • machine learning objectives and main challenges,
  • tensor operations,
  • automatic differentiation, gradient descent,
  • deep-learning specific techniques,
  • generative, recurrent, attention models.
You can check the pre-requisites.
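As a minimal sketch of the tensor-operation, automatic-differentiation, and gradient-descent topics listed above (an illustrative toy example, not taken from the course materials), here is a tiny PyTorch snippet fitting a linear model:

    # Toy PyTorch example: tensors, autograd, and plain gradient descent
    # on a linear least-squares problem. Illustrative only, not course material.
    import torch

    # Synthetic data: y = 3x + 1 plus noise
    x = torch.linspace(-1, 1, 100).unsqueeze(1)
    y = 3 * x + 1 + 0.1 * torch.randn_like(x)

    w = torch.zeros(1, 1, requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    lr = 0.1

    for step in range(200):
        loss = ((x @ w + b - y) ** 2).mean()  # mean squared error
        loss.backward()                        # autograd fills w.grad and b.grad
        with torch.no_grad():                  # gradient-descent update
            w -= lr * w.grad
            b -= lr * b.grad
            w.grad.zero_()
            b.grad.zero_()

    print(f"w ~ {w.item():.2f}, b ~ {b.item():.2f}")  # should approach 3 and 1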

This course was developed initially at the Idiap Research Institute in 2018, and taught as EE-559 at École Polytechnique Fédérale de Lausanne until 2022. The notes for the handouts were added with the help of Olivier Canévet.

Thanks to Adam Paszke, Jean-Baptiste Cordonnier, Alexandre Nanchen, Xavier Glorot, Andreas Steiner, Matus Telgarsky, Diederik Kingma, Nikolaos Pappas, Soumith Chintala, and Shaojie Bai for their answers or comments.

In addition to the materials available here, I also wrote and distribute "The Little Book of Deep Learning", a phone-formatted short introduction to deep learning for readers with a STEM background.

LECTURE MATERIALS​

The slide pdfs are the ones I use for the lectures. They are in landscape format with overlays to facilitate the presentation. The handout pdfs are compiled without these fancy effects in portrait orientation, with additional notes. The screencasts are available both as in-browser streaming or downloadable mp4 files.

You can get archives with all the pdf files (1097 slides):

and subtitles for the screencasts generated automatically with OpenAI's Whisper:

 

bnew


Meta disbanded its Responsible AI team​



A new report says the members of Meta's Responsible AI team are now working on other AI teams.

By Wes Davis, a weekend editor who covers the latest in tech and entertainment. He has written news, reviews, and more as a tech journalist since 2020.

Nov 18, 2023, 4:24 PM EST

Illustration by Nick Barclay / The Verge

Meta has reportedly broken up its Responsible AI (RAI) team as it puts more of its resources into generative artificial intelligence. The Information broke the news today, citing an internal post it had seen.

According to the report, most RAI members will move to the company's generative AI product team, while others will work on Meta's AI infrastructure. The company regularly says it wants to develop AI responsibly and even has a page devoted to the promise, where it lists its "pillars of responsible AI," including accountability, transparency, safety, privacy, and more.

The Information’s report quotes Jon Carvill, who represents Meta, as saying that the company will “continue to prioritize and invest in safe and responsible AI development.” He added that although the company is splitting the team up, those members will “continue to support relevant cross-Meta efforts on responsible AI development and use.”

Meta did not respond to a request for comment by press time.

The team already saw a restructuring earlier this year, which Business Insider wrote included layoffs that left RAI “a shell of a team.” That report went on to say the RAI team, which had existed since 2019, had little autonomy and that its initiatives had to go through lengthy stakeholder negotiations before they could be implemented.

RAI was created to identify problems with its AI training approaches, including whether the company’s models are trained with adequately diverse information, with an eye toward preventing things like moderation issues on its platforms. Automated systems on Meta’s social platforms have led to problems like a Facebook translation issue that caused a false arrest, WhatsApp AI sticker generation that results in biased images when given certain prompts, and Instagram’s algorithms helping people find child sexual abuse materials.

Moves like Meta’s and a similar one by Microsoft early this year come as world governments race to create regulatory guardrails for artificial intelligence development. The US government entered into agreements with AI companies and President Biden later directed government agencies to come up with AI safety rules. Meanwhile, the European Union has published its AI principles and is still struggling to pass its AI Act.
 