I’ve had a few pet projects in the past around RSS aggregation and news reading that could fact-check the sources and the article while you read, and also estimate bias from the grammar and linguistic patterns the journalist used in the article. The same could be applied to comments.
I wonder if such a feature would have value for a reader app for Lemmy? I feel a definitive score is toxic. But if it were to simply display the variables to look out for, it could help you make an objective decision yourself.
Another application of this is pulling just the objective statements out of the articles for faster reading.
Edit: More explained in this comment: https://lemmy.world/comment/1524807
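To make the "pull just the objective statements" idea concrete, here is a minimal sketch. It assumes a tiny hand-written list of subjective markers and a naive sentence splitter; both are placeholders I made up for illustration, not what the pet projects actually used, and a real version would need much better language handling.

```python
import re

# Hypothetical marker list, for illustration only.
SUBJECTIVE_MARKERS = (
    "i feel", "i think", "i believe", "in my opinion", "clearly", "obviously",
)

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def is_subjective(sentence):
    """True if the sentence contains any marker as a whole word or phrase."""
    lowered = sentence.lower()
    return any(
        re.search(r"\b" + re.escape(marker) + r"\b", lowered)
        for marker in SUBJECTIVE_MARKERS
    )

def objective_sentences(text):
    """Keep only the sentences that trip none of the subjective markers."""
    return [s for s in split_sentences(text) if not is_subjective(s)]

article = (
    "I feel great about XYZ. "
    "The bill passed 52 to 48 on Tuesday. "
    "Obviously this is a disaster."
)
print(objective_sentences(article))
```

The reader still sees which sentences were dropped if the app shows them greyed out instead of hiding them, which fits the "display the variables, no score" approach.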
Apart from what @AlataOrange has said, I think your “on-device model” would die of overload in its first 5 minutes of operation. Most comments are biased. Everyone has an agenda, whether they are conscious of it or not. If I want factual things, I’ll read the factual things elsewhere on the internet. If I want some buttery popcorn, I’ll microwave some and read the comments.
I guess it would only help for reading articles, if anything, or for a comment written in an informative tone, such as “Actually, this is so and so because of so and so”. But I see your point.
So, your software would go to the link provided (if there’s a link provided) and scan the text of the article for language that sounds biased. This is an interesting exercise in computer programming, but it wouldn’t be useful. Imagine the biased reaction of the user that wants or does not want the article to be judged “biased” by a computer program. I could just hear people muttering to themselves, “damn algorithm.” This is something software is getting better at, but it’s still not reliable. Take, for example, some software from my field: The kind that detects plagiarism. When I get student papers, I have to scan them through the plagiarism detector. After that, I have to inspect the ones that were flagged as “potential plagiarism.” I’ve had to use this type of software for over a decade, and it’s still problematic. I’ve had situations in which I found the plagiarism and the software did not. I’ve had countless situations in which the software found plagiarism but there was no plagiarism. So, I don’t know, your goals as a computer scientist are lofty. Still, I want you to keep your bias detecting software away from my reading in my day to day. Anyway, human beings either have the reading skills and knowledge about where to get the facts from or they do not. If they are ignorant enough to require a computer program to judge for them, they will question the software’s judgment, anyway, whether it’s right or wrong. Why? Everybody’s got an agenda.
Yeah, that is completely understandable.
I guess it’s less the standard “AI” you may be thinking of, the kind that simply takes something in and outputs something. Instead, it has multiple preprocessing steps before detection and post-processing after it. For instance, it would parse an article sentence by sentence and flag subjective statements such as “I feel great about XYZ”, while searching for statements that back up such claims with data, as in the standard “Claim, Lead-in, Data, Warrant” format in writing. Then it would check the data source recursively until it finds that it is in fact valid. Now, this “validity” is threatening, because yes, it can be controlled. But there can definitely be transparent, community-led approaches to adjusting which sources are considered valid. Without resources, an initial solution would be to create a person graph of the sources’ authors and/or map against a database of verifiable research repositories such as JSTOR, find linked papers mentioning the same anecdotes, or simply follow a trail of links until a link hits a trusted domain.
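A rough sketch of the "follow a trail of links until it hits a trusted domain" step, under some loud assumptions: the trusted list here is a one-entry placeholder (in practice it would be the community-maintained part), `get_links` is a hypothetical callback that returns the links cited by a page, and the example uses an in-memory dict with made-up URLs instead of real HTTP fetching.

```python
from collections import deque
from urllib.parse import urlparse

# Placeholder trusted list; who maintains it is the open question.
TRUSTED_DOMAINS = {"jstor.org"}

def domain_of(url):
    """Hostname of the URL without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def is_trusted(url, trusted=TRUSTED_DOMAINS):
    """True if the URL's domain is a trusted domain or a subdomain of one."""
    domain = domain_of(url)
    return any(domain == t or domain.endswith("." + t) for t in trusted)

def trace_to_trusted(start, get_links, trusted=TRUSTED_DOMAINS, max_hops=5):
    """Breadth-first walk over cited links.

    Returns the shortest trail of URLs from `start` to a trusted URL,
    or None if nothing trusted is reachable within `max_hops` hops.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if is_trusted(current, trusted):
            return path
        if len(path) - 1 >= max_hops:
            continue  # hop budget used up on this trail
        for nxt in get_links(current):
            if nxt not in seen:  # also guards against citation loops
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Made-up link graph standing in for fetched pages.
LINKS = {
    "https://blog.example/post": ["https://news.example/story"],
    "https://news.example/story": [
        "https://blog.example/post",
        "https://www.jstor.org/",
    ],
}

trail = trace_to_trusted("https://blog.example/post", lambda u: LINKS.get(u, []))
print(trail)
```

Returning the whole trail rather than a yes/no keeps it transparent: the reader can see exactly which hops led to the trusted source and judge them.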
Then there is also the question of whether all the sources are heavily weighted toward one side when the topic clearly has valid devil’s-advocate arguments. This is where bias can come in. Post-processing would look for possible “anti-arguments” to the claims and warrants (if available in the store of verifiable sources). The point is not to force a point, but to open the reader’s paradigm.
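For the one-sidedness check, something like the following would do as a first pass. It assumes each cited source has already been given a stance label by some earlier step (that labeling is the hard part and is not shown), and the function name and the 0.75 cutoff are arbitrary choices of mine for illustration.

```python
from collections import Counter

def stance_breakdown(stances, lopsided_at=0.75):
    """Summarise how cited sources split across stances.

    `stances` is a list of labels such as "supports" or "opposes".
    Returns raw counts plus a flag, not a verdict or a score.
    """
    counts = Counter(stances)
    total = sum(counts.values())
    if total == 0:
        return {"counts": {}, "dominant": None, "share": 0.0, "lopsided": False}
    dominant, n = counts.most_common(1)[0]
    share = n / total
    return {
        "counts": dict(counts),
        "dominant": dominant,
        "share": share,
        "lopsided": share >= lopsided_at,
    }

print(stance_breakdown(["supports"] * 4 + ["opposes"]))
```

A lopsided result would only be a cue to go looking for the “anti-arguments”, not a judgment that the article is wrong.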
I see how using “fact-checking” in my OP came across as pretty negative/controversial. But as a computer scientist, I’m not trying to impose any control over what is “morally right” or what the “Capital T truth” is. I strongly agree that computer ethics needs to be a focus. Seeing your perspective was a great take to keep in mind. But the passion is mostly driven by the black-and-white culture of online opinions, hence your point about agenda.
deleted by creator
Anyways, I’d like to say we are kind of agreeing. Not sure what caused that aggression. I do think of things in a product sense, but that is the byproduct (no pun intended) of my learning environment. If we are talking about philosophy, I should definitely read up some more. But my understanding of capital-T truth mostly came from my reading of David Foster Wallace’s book “This Is Water”. I will expand on it and circle back to improve my writing so it communicates my thoughts better.
deleted by creator
I may have misinterpreted the tone then, likewise.
This actually got me thinking quite a bit, and I was hoping you’d expand on it. Is it more directed at building things that are not driven by a personal truth?
deleted by creator
I said I don’t… And I said it’s not to find it, but to essentially provide the reader with the data points to do so on their own. Like I said in the OP:
I feel a definitive score is toxic. But if it were to simply display the variables to look out for, it could help you make an objective decision yourself.
deleted by creator
Sure, I will. But I will wait for more perspectives before I move on to the next one. It would be a major mistake to continue on this alone. The idea is to have a team to compensate for the flaws that you are potentially observing.