We seem to be at an interesting feedback loop point for ML models.
We have textual and visual models strong enough to look at a picture and describe it, or to read a body of text and categorize and label it well.
That work can then be used to feed the training of other models, whether it's larger models that can make use of an ever-growing, well-labelled corpus, or smaller, more efficient models trained on sets that were previously too niche or cost-prohibitive to label individually.
With this coming out of Microsoft for Copilot/Bing, and given the strength of recent smaller models that have had the benefit of GPT-4 and other larger LLMs to assist with training, we appear to be at an inflection point in training quality at the same time as training compute is being massively scaled.
When GPT4 gets it wrong… all the other LLMs do as well. The error gets reinforced when you ask a quorum of LLMs a question and a majority can trace the same wrong answer back to GPT4.
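To make the reinforcement effect concrete, here is a minimal sketch of what a majority-vote quorum over several models looks like. The function name and the sample answers are hypothetical; the point is that if most models inherited the same wrong answer from a shared teacher, the vote amplifies the error rather than correcting it:

```python
from collections import Counter

def quorum_answer(answers):
    """Majority vote over answers from several models (ties broken arbitrarily).
    Returns the winning answer and the fraction of models that agreed."""
    winner, count = Counter(answers).most_common(1)[0]
    return winner, count / len(answers)

# Hypothetical: three of four models repeat the same inherited mistake,
# so the quorum confidently returns the wrong answer.
print(quorum_answer(["wrong", "wrong", "wrong", "right"]))
# -> ('wrong', 0.75)
```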
And you thought getting something out of Google's index was hard.
Try removing it from the weights of a bunch of LLMs!
Sure, doing surgery on the weights is not feasible, but doing surgery on the training data is possible (and is done for certain benchmarks).
That's where it gets interesting. If you tell the LLM what the bad data is, it could examine its own training set and fix the mistakes. Get enough LLMs with diverse enough training sets, and they will correct each other.
If models are stochastic parrots, you might expect training data to devolve into corruption through a self-reinforcing feedback loop, but the opposite might happen if models are actually intelligent.
> Extensive experiments using both human and automatic evaluation metrics demonstrate that TnT-LLM generates more accurate and relevant label taxonomies when compared against state-of-the-art baselines, and achieves a favorable balance between accuracy and efficiency for classification at scale.
It sounds like it could be used very effectively to create even more detailed databases of surveillance on us, by taking every type of unstructured thing we write online and making a massive database out of it. Then the information will be put into models and used to predict our future behaviour. Excellent for insurance companies and law enforcement to further subjugate and restrict human behaviour.
… Or an accounting firm can do this to Enron’s books and it tells them where the fraud is. Or an aircraft manufacturer can throw in all the product documentation for their aircraft and all of their parts suppliers and the maintenance records too and turn it into a database. Or an international regulator could do the same for air traffic control regulations and incident reports. Or a global beverage manufacturer or a clothing and footwear company can do the same for their social media interactions. Or a gamer can do it for a video game walkthrough and ask “What do I have to do to build item X?”
So far as surveillance goes, about ten years ago I went to a conference where there was a workshop on text analysis put on by none other than Edward Snowden’s boss’s boss. Except few of us knew who Edward Snowden was until a few weeks later!
That was the beginning of the phase where I talked to all sorts of folks about how to do text analysis and, despite finding a few projects where people’s heads were well underwater, found that the people working for three-letter agencies around Research Triangle Park had a lot worth learning from, while Silicon Valley folks, academics, etc. had less to offer. Also some companies you’ve heard of, and even more from India that you haven’t, were doing big projects with Apache UIMA and half a battalion’s worth of annotators.
> … Or an accounting firm can do this to Enron’s books and it tells them where the fraud is. Or an aircraft manufacturer can throw in all the product documentation for their aircraft and all of their parts suppliers and the maintenance records too and turn it into a database. Or an international regulator could do the same for air traffic control regulations and incident reports. Or a global beverage manufacturer or a clothing and footwear company can do the same for their social media interactions. Or a gamer can do it for a video game walkthrough and ask “What do I have to do to build item X?”
An excellent argument that global society has become too complex and we should return to a greater dependence on local communities and less complexity.
Instead of using AI, let's just take Enron down, reduce air travel, remove international trade except for some very crucial items, and DESTROY global beverage and clothing manufacturers. A pox on society those last two are.
As for video games, that seems like a stretch. There are plenty of fine video games that were made before AI.
We ran it on 1 million LLM queries and posted the results here.
https://huggingface.co/datasets/lamini/earnings-calls-qa
These pipelines can be expensive: running this on Claude 3 Opus would cost about $50,000.
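For a rough sense of where a figure like that comes from, here is a back-of-the-envelope sketch. The per-token prices are Claude 3 Opus list prices at the time ($15 per million input tokens, $75 per million output tokens); the average token counts per query are pure assumptions chosen to illustrate how 1M queries lands near $50k, not numbers from the linked dataset:

```python
# Assumed Claude 3 Opus list pricing, USD per million tokens.
OPUS_INPUT_PER_MTOK = 15.00
OPUS_OUTPUT_PER_MTOK = 75.00

def pipeline_cost(queries, avg_in_tokens, avg_out_tokens):
    """Total USD cost for `queries` calls at the given average token counts."""
    per_query = (avg_in_tokens * OPUS_INPUT_PER_MTOK
                 + avg_out_tokens * OPUS_OUTPUT_PER_MTOK) / 1_000_000
    return queries * per_query

# Hypothetical averages: ~2,000 prompt tokens and ~270 completion tokens
# per query puts 1M queries in the ~$50k range.
print(round(pipeline_cost(1_000_000, 2_000, 270)))
# -> 50250
```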