Understanding false positives within Turnitin’s AI writing detection capabilities

Captions
Hello again. I'm David Adamson, an AI scientist at Turnitin and a former high school teacher. Turnitin is getting ready to share an AI writing detector with our users, with you, the instructors, so that you can engage with and understand how your students are beginning to use AI writing tools. It's important for us to be clear with you about the reliability of our predictions.

We have decided to prioritize precision in our detector. That is, if we say that a document has AI writing in it, we want to be pretty sure about that. Preferring precision might mean we miss some AI writing that's really there; we might have a lower recall. We're fine with that: let's miss some stuff and be more right about what we find.

Our evaluation set is a big bag of documents representing, as best we can, the many ways people can write in an academic context and the ways they're using AI writers, perhaps mixed in with their own authentic writing. We're using this data to set a threshold on our predictions, only counting text as AI-written if its detection score meets our high-precision target. When we're wrong, it will be an underprediction more often than not (except when we don't, because sometimes we won't). We expect we'll be wrong about one out of a hundred fully human-written documents; that is a false positive rate of about one percent. That's pretty good, but it's not zero, and that means you'll have to take our predictions, as you should with the output of any AI-powered feature from any company, with a big grain of salt. You, the instructor, have to make the final interpretation. You know the student, you know the context, and it's important for you to know how and when we might be wrong.

So here's some sense of the flavor of our false positives. First of all, repetitive writing: if a text substantially repeats itself, either word for word or closely paraphrasing what came before, it may get predicted as AI writing when it's just super redundant. Second, and relatedly, our detector is meant for paragraphs of English-language prose, not for lists, outlines, short questions, code, or poetry. Sometimes such submissions have a lot of self-similarity from item to item, and they don't read like paragraphs, and that is going to cause us to stumble.

This prompts a big question: what about developing writers, English language learners, whose writing might be more redundant? Well, we very purposefully oversample from such writing in both our training data and our evaluation set. Despite this, real talk: our false positive rate is slightly higher for secondary-level writing, for middle and high school students, than it is for higher education. It's still near our one percent target, but there is a difference, and we'll be working on this. Happily, however, we don't yet see any evidence that we're biased against English language learners from any country at any level, and that's the sort of thing we're going to keep a close eye on as we move toward production.

So I hope this gives you all a sense of how we're approaching this task. We want to own our mistakes; we want to understand, and share with you, how and when we're wrong. We're leaning in for precision and fairness, even if it means we might miss some AI writing that's there. Thank you.
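The threshold-setting idea the speaker describes, flagging text as AI-written only when its detection score clears a cutoff chosen to hit a precision target, can be sketched in a few lines. This is a minimal illustration on invented toy data, not Turnitin's actual data or method; the function name, scores, and labels are all assumptions made for the example.

```python
# Sketch of precision-first threshold selection on a labeled evaluation set.
# Everything here is illustrative: scores, labels, and the helper name are
# invented for this example, not Turnitin's implementation.

def pick_threshold(scores, labels, target_precision=0.99):
    """Return the lowest cutoff whose precision on the evaluation set
    meets the target, or None if no cutoff reaches it.

    scores: detector scores in [0, 1]; labels: 1 = AI-written, 0 = human.
    """
    acceptable = []
    for cut in sorted(set(scores)):
        flagged = [y for s, y in zip(scores, labels) if s >= cut]
        if flagged and sum(flagged) / len(flagged) >= target_precision:
            acceptable.append(cut)
    # The lowest acceptable cutoff keeps recall as high as the
    # precision target allows.
    return min(acceptable) if acceptable else None

# Toy evaluation set: four human documents, four AI-written ones.
scores = [0.10, 0.20, 0.35, 0.55, 0.60, 0.80, 0.90, 0.95]
labels = [0,    0,    0,    1,    0,    1,    1,    1]

threshold = pick_threshold(scores, labels, target_precision=1.0)
# At this cutoff no human document is flagged, but the AI-written
# document scoring 0.55 is missed: precision is bought with recall.
```

Raising `target_precision` pushes the cutoff up, trading missed AI writing (lower recall) for fewer false positives, which is the trade-off the video describes.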
Info
Channel: Turnitin
Views: 12,862
Keywords: turnitin, grading, plagiarism, edtech, education, turn it in
Id: 4e9zM2MZvRQ
Length: 3min 37sec (217 seconds)
Published: Tue May 23 2023