It’s hard to deny that Artificial Intelligence (AI) large language models, and ChatGPT in particular, are taking the world by storm. Just two months after launch, ChatGPT was estimated to have reached 100 million monthly active users.
Detractors of ChatGPT often mention its occasional “hallucinations”—wrong answers delivered with total confidence. However, what we should be worried about is not software failures. Large language models are a type of AI system trained on massive amounts of data. What criteria do OpenAI, Microsoft, Google, and other tech giants use to curate the data on which their AI models are trained?