@cjd reddit already exists, why would you bother a graphics card to generate midwit slop?
@cjd archive team distributes docker containers that basically scrape reddit all day, and they seem to hit it p hard. you could probably get away with quite a bit before getting throttled.
@cjd @Moon
Obvious jokes aside, the problem is that you cannot have humans create the dataset, since humans are incapable of making this distinction themselves.
The entire concept of schizophrenia and intelligence being two sides of the same coin applies here tenfold. Because brilliant people see patterns that you cannot visualize, you cannot know whether they are actually smart or whether they are bullshitters.
This is why most attempts at doing this end up just recognizing how niche the words you use are, since niche words are needed to write a scientific article. But then you immediately run into social science loons, who cannot form a single sentence without going full systemic prejudice against marginalized metaphors for cheese.
@cjd @Moon
> Space and time are the same thing
This is literally either the most brilliant thought of the 20th century or something a local greenhead said after his 5th joint.
> All matter is created from energy
Now, you're either talking to a quantum physicist, who understands the nature of matter, anti-matter etc., or you're talking to Teal Swan.
See the issue?
@Moon@shitposter.club @cjd@pkteerium.xyz No they promote fake articles https://www.theverge.com/23753963/google-seo-shopify-small-business-ai
It's not clear why, though.
I suppose it might be something like this: they have an (AI) algorithm, people reverse-engineer how it works to some extent and provide content that does not necessarily make sense but makes the algorithm give them a higher ranking, and then they even train AI to generate such content.
The result is AI talking to AI, and the need for people to find some other place to look for information.
But, and this is a simple implementation: Suppose you use a text classifier on the individual sub-phrases, then for each one of those, you output a neural layer snapshot represented as an image. Then you take the images making up a sentence and feed them through a net to pattern-recognize similar sentences, and again you output a neural snapshot as an image.
At each level of this, you can train using two similar phrases and one different one. The reward function is based on the neural images of the similar phrases being more similar to each other (the XOR of their pixels has fewer set bits) than either is to the image of the different one.
Feed those images back in, this time per-paragraph, and you should have a form of paragraph-level classification. Then you feed that output into a network which classifies text into a score, and you train on things you find worthwhile.
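A minimal sketch of the XOR-based reward described above, assuming the "neural images" have already been binarized into flat lists of 0/1 pixels. The function names `xor_distance` and `triplet_reward`, the `margin` parameter, and the toy 8-pixel images are all made up for illustration; nothing here is a real model, only the comparison step.

```python
from typing import Sequence


def xor_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the pixels that differ between two binary images (XOR, then sum)."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(x ^ y for x, y in zip(a, b))


def triplet_reward(anchor: Sequence[int], similar: Sequence[int],
                   different: Sequence[int], margin: int = 1) -> int:
    """Positive when the similar phrase's image is closer to the anchor
    than the different phrase's image is, by at least `margin` pixels."""
    return xor_distance(anchor, different) - xor_distance(anchor, similar) - margin


# Toy 8-pixel "neural images" (hypothetical values, not real activations).
anchor = [1, 0, 1, 1, 0, 0, 1, 0]
similar = [1, 0, 1, 0, 0, 0, 1, 0]
different = [0, 1, 0, 1, 1, 0, 0, 1]

print(triplet_reward(anchor, similar, different))
```

In a real setup the reward would drive the weights of whatever net produces the images; this only shows what "XOR of the pixels is less" might mean as a number you can optimize.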
@cjd @Moon
And to finish this, take conspiracy theories.
They are theories. By definition, they are something that the people who believe in them weren't able to disprove.
But when you talk to a normie, they take the word theory and believe that it means, by definition, that it has to be false. This is intelligence at the 100 IQ level. But it is based on a deeper problem: we are working with an insanely limited amount of measurable data. And AI can only compare concepts to concepts; it cannot compare concepts to data. That means its potential is limited to things that humans have already measured.
BTW humans have a way of signaling and detecting intelligence - that is through humor. It's like the first man-made proof-of-work: It requires more brain cells to be funny (prove) than it does to laugh (validate).
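For anyone who hasn't seen the proof-of-work asymmetry this is riffing on, here is a rough sketch using a hash puzzle. The `message:nonce` format, the `difficulty` parameter (number of leading hex zeros), and the function names are assumptions picked for illustration, not any particular blockchain's scheme.

```python
import hashlib


def solves(message: str, nonce: int, difficulty: int) -> bool:
    """Validate: one hash is enough to check a claimed solution."""
    digest = hashlib.sha256(f"{message}:{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def prove(message: str, difficulty: int) -> int:
    """Prove: brute-force nonces until the hash has enough leading zeros."""
    nonce = 0
    while not solves(message, nonce, difficulty):
        nonce += 1
    return nonce


nonce = prove("knock knock", 2)           # many hashes on average
print(solves("knock knock", nonce, 2))    # a single hash to confirm
```

The prover loops over many candidates while the validator does one check, which is the "funny vs. laugh" cost gap in the analogy.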
@Moon @cjd "Google is incredibly bad at internal stuff."
One of the reasons https://killedbygoogle.com/ et al. are a thing is that aside from a couple of smallish git repos, like maybe one for their version of Linux?, everything is in a huge monorepo and unless something is constantly maintained it'll rot.
You don't get rewarded for pure maintenance of anything that's not huge, while new product launches are the best way most of their people can get rewards, promotions (AKA more $$$, and promotion opportunity is stack ranked) etc. Also see how fast they sunset Google Cloud offerings, one reason it's a very distant third or fourth to AWS and Azure.
We're also told this is why Google Reader was killed, which was an inflection point in the perception of the company. The guy they hired for Google+, who had thrived under Ballmer's stack ranking at Microsoft and had a remit to run roughshod over everything else, would have had to expend effort to keep it working as they repurposed some of its stuff for what was for a while the only project after advertising that mattered.
https://steve-yegge.medium.com/dear-google-cloud-your-deprecation-policy-is-killing-you-ee7525dc05dc
> isn't based around the word structure
Well, yes and no. REALLY stupid text has an identifiable structure. Midwit text looks smarter than it is. What I'm looking for is how to make the model deep enough to identify quality fedi banter.
Of course midwit diarrhea is a moving target w/ Goodhart's law, especially if people start training GPTs against my "quality posts" classifier...
@cjd @Moon
Also, just consider that the biggest problem with intelligence is in fact communication. Smart people see a problem in its entirety, whereas describing the route from A to B is a language problem with exponential growth in the words needed to describe each step and the needed connections. While this is not a problem for AI (for obvious reasons), it shows exactly why humans cannot create the dataset.
We were so confident about this issue not existing that we even created a logical fallacy to describe logic too advanced for our understanding. We call it a slippery slope.
@Moon @cjd
This guy wrote his own OS. Does he sound "smart" though?
https://www.youtube.com/watch?v=3CC8EopC4hU
@cjd @Moon
If you want to make an AI learn what interests YOU, then even the dumbest "find words I like" system will do the job. As long as the text contains "linux, boot, freeware, software, hardware", it's great.
If it contains "republican, democrat, trump, fuck, Nigger" It's BAD.....
But you changed the goal entirely now. Your original post was about finding intelligence.
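To show how dumb the "find words I like" system can be, here is a rough sketch. The liked words are the ones from the post; the disliked set is trimmed to a few of the listed words, and the scoring rule (+1 per liked word, -1 per disliked word) and the name `keyword_score` are assumptions for illustration.

```python
import re

# Word lists taken from the post above; the disliked list is shortened.
LIKED = {"linux", "boot", "freeware", "software", "hardware"}
DISLIKED = {"republican", "democrat", "trump"}


def keyword_score(text: str) -> int:
    """+1 for every liked word, -1 for every disliked word, 0 otherwise."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum((w in LIKED) - (w in DISLIKED) for w in words)


print(keyword_score("Linux boot on new hardware"))
print(keyword_score("trump vs democrat"))
```

It measures topic, not intelligence: any post that happens to mention the right nouns scores well, however stupid it is.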
But the thing is, I can't think of any particular set of words that communicate what I would find interesting - it's like trying to word-filter for what you'll find funny. What do you do? Filter for "knock knock"? Maybe if you're 5.
Most political takes are boring and repetitive. Most Science is horrifying midwittery. Most blockchain takes are spam and get-rich-quick. Most conspiracy takes are aliens and flat earth bullshit. BUT, there's 1% in each category which is a flash of brilliance (IMO) and I'd really like to try to filter for it...
@Moon @LukeAlmighty @cjd Einstein's "misguided opposition to quantum physics" was I think more opposition to the "Copenhagen interpretation" than to all of quantum mechanics, which I'm restudying right now with Thirty Years That Shook Physics, which I last read in the 1980s.
As in, per Otto Robert Frisch's autobiography (he was a top experimentalist of the era and the guy who asked the key question which led to practical atomic bombs), Einstein is credibly claimed to be the first scientist to take Max Planck seriously. Although you could possibly add "... and get results."
Planck just about tore his hair out (seriously, look at photos of him around this time) trying to explain the emission of light, and came up with the idea that light is packaged up in quanta, small packets of energy we now call photons. He was using evidence from emission, and of course assumed absorption was the same.
Einstein took the empirical, experimental laws of the photoelectric effect, where light hitting metal throws off electrons, and explained how quanta of light perfectly explained them, thus closing the circle, which really got the ball rolling. It was so important that it was the specific thing cited for his Nobel.
It's pretty amazing stuff from a historical perspective. The next theoretician to make headway was Niels Bohr, and per the book one of the assumptions he made was that hydrogen had only one electron. Nobody knew!
have it exclude writing that includes too many adjectives
Provided the AI understands the concept of an adjective and its relationship to a noun
Can this AI read books? If so, then have the AI read certain "men of letters" (blank slate AI).
Perhaps beforehand, have it learn to distinguish verbose writing (too many adjectives, too many adverbs, and buzzwords)
A bank of buzzwords can be maintained quite easily, and have any pop-culture article containing these words be flagged as stupid or low-brow. (This is giving the AI some agency.)
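A rough sketch of the buzzword-bank idea, for whoever wants to throw rocks at something concrete. The bank contents, the 0.2 threshold, and the names `buzzword_ratio` and `is_low_brow` are all made up for illustration. Counting adjectives and adverbs would need a part-of-speech tagger, which this deliberately leaves out.

```python
import re

# Hypothetical buzzword bank; a real one would be maintained by hand.
BUZZWORDS = {"synergy", "disruptive", "paradigm", "leverage"}


def buzzword_ratio(text: str) -> float:
    """Fraction of the words in `text` that appear in the buzzword bank."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    return sum(w in BUZZWORDS for w in words) / len(words)


def is_low_brow(text: str, threshold: float = 0.2) -> bool:
    """Flag text whose buzzword ratio exceeds the (assumed) threshold."""
    return buzzword_ratio(text) > threshold


print(is_low_brow("leverage synergy for a disruptive paradigm"))
print(is_low_brow("the kernel boots from disk"))
```

Using a ratio rather than a raw count keeps long articles from being flagged just for being long.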
Anyway. My wheelhouse is mathematics and Tech. Comm. and not software engineering, so throw rocks if you like.
@LukeAlmighty @Moon
Training an AI on quality writing is interesting, but tweets and fedi posts are perhaps better for training because they're short and so it doesn't take reading over pages and pages of long words to determine whether you're dealing with genius or reddit-tier poop.
@cjd @pepsi_man @Moon
I am sorry, but I thought you were very educated when it comes to IT.
So, why would you believe that less information is good for learning?
thankfully you improve them both when ur young, probably harder when ur older but it can still happen tho