All AI models require training data, and most developers get their data unethically. Everyone seems to be doing it: OpenAI was recently hit with multiple lawsuits claiming the company stole internet data to train its popular ChatGPT tool. And Stability AI, the creator of the well-known Stable Diffusion tool, is being sued by Getty Images for copyright infringement, with Getty alleging that it used millions of images without proper licensing.
AI data collection is too hard to do ethically