Large data sources with bad data vs. Small data sources with good data.

Large data sources with bad data, or smaller data sources with good data: which is better?

Large, bad data is obviously not good. Some have recognized the result as “model collapse”: the bad data causes more bad data to be generated. Small, bad data needs no discussion. Small, good data (vetted, provenance-known), perhaps held within your enterprise walls, is perhaps the way to go until large, good data that can be used for training appears. When will that happen? I do not think it is on the horizon.
