tamu datathon experience

ok so i was going to write about this as soon as it happened, but got a bit carried away with other things. first off, these are just my thoughts on how i felt, so please don't take them out of context. for a little more context, tamu datathon is a hackathon-style event with data science oriented challenges hosted by an organization at texas a&m. by the way, a lot of this is kind of a rant, but what's the point of a blog if you can't be honest even when it's not always positive?

i went into datathon thinking that i was going to attempt an easy challenge and not care too much about it. i had no expectation of finishing any challenge or really competing because of a previous datathon experience (to sum it up, my team's first place submission got deleted by the challenge owner). you would think i would have learned my lesson the first time, but once you get there, that all kind of changes. naturally, everyone is competitive at heart, right? you sort of believe anything is possible. and so akshay and i attempted one of the harder challenges.

before the challenges were even officially released, i found a way to get access to all their google drive links for the challenges (so we saw everything about an hour in advance). this is just to foreshadow how poorly the whole event was set up.

anyways, the goal was to create a model to beat the new york times connections game. first of all, this hasn't actually been achieved yet, as multiple researchers have shown, and current llms are the closest anything has come to beating the game. after i asked a question about the challenge, the challenge owner right away started practically boasting about how his model performed for absolutely no reason, and made a claim that's very likely false. for some reason, anytime i try to ask the challenge owner a question (the same thing has happened in the past), he seems to give a very negative/rude response.

they were supposed to release a leaderboard within 2 hours of the challenge starting, which turned into a few more hours, which turned into later in the night, which turned into the morning, which turned into no leaderboard at all. instead it became a github submission that goes through some pipeline that runs the code in a docker container on their end. it turns out this pipeline didn't work for most people because they didn't submit correctly, but it worked for a couple of teams (including ours). they couldn't get two of the challenges working by the actual closing ceremony day, so they decided to postpone figuring out the challenge winners to the next day so they could help people fix their code (this shouldn't have even happened). when they presented the challenge winners the next day during a livestream, they got through the first challenge with a few minor issues, taking about 3 hours of everyone's time, and then when it came time to release the second challenge results (the one i participated in) they claimed to have accidentally deleted the database. so it got postponed even further, and somehow they arrived at results within 20 minutes after that.

it turns out that we didn't place 1st, 2nd, or 3rd. you could call me a sore loser, but i genuinely didn't believe that everyone else's code beat ours. i went to everyone's repository and tested their code against the testing data, where they all performed significantly worse than ours. since we used an llm, nothing of ours could've been trained on the testing data, so our results couldn't have been the result of overfitting. and even just from taking a quick look at all the winners' code and how it was written, it was impossible that they outperformed what we had.

i tried asking the challenge owner to just give us a simple score (they had a scoring system for the game) for how our code performed, and they couldn't even provide that. they claimed that they would talk in more depth about the challenge in their "blog" and provide more insights later (which hasn't happened). everything they said seemed to be lies built on more and more lies because they knew they messed up. this whole experience just rubbed me the wrong way. you would think that objective-based challenges would be pretty straightforward, but apparently not.

all this to say that this event was very poorly run and has been run poorly for the past couple of years. it seems to stem from the same person who runs the challenges every year. it also comes down to poor leadership putting full faith in the wrong person.

this whole blog was kind of a rant, but needless to say, i believe the organization needs to be run better. i hope something changes in the next year as people graduate, so that participating in these events can be enjoyable. i love building things, so when something like this happens, it's quite frustrating. but hopefully lesson learned this time :)