marco cognetta theoretically good with computers

recent updates

  • (2024-10) I'm on Bluesky @mcognetta.bsky.social.

  • (2024-09) Team 🍞 (Tyler Woodruff, Oleg Filatov, and myself) won the IEEE BigData Cup: Predicting Chess Puzzle Difficulty Challenge. The camera-ready paper and code will be available soon, and the paper will be presented at IEEE BigData.

  • (2024-09) My paper, Distributional Properties of Subword Regularization (with Vilém Zouhar and Naoaki Okazaki), was accepted to EMNLP. The preprint is here and the camera-ready version will be available soon.

  • (2024-02) My paper, Two Counterexamples to Tokenization and the Noiseless Channel, was accepted at LREC-COLING 2024. The preprint is here and the camera-ready version and code will be available soon.

    • The camera-ready version is here and the code is here.

  • (2023-07) I presented LotteryTickets.jl: Sparsify Your Flux Models at JuliaCon2023. The recording is here, and the slides and repo are here.

  • (2023-05) I presented Parameter-Efficient Korean Character-Level Language Modeling at EACL2023. The paper can be found here.

  • (2022-11) I've joined Mastodon on the sigmoid.social instance, which is focused on the ML/AI research community. My profile is @mc@sigmoid.social.

  • (2022-04) I've moved to Tokyo to join the Okazaki Lab for my PhD. I'll continue working part-time at Google Tokyo.

recent posts (all posts)

about me

I am currently a PhD student in NLP in the Okazaki Lab at the Tokyo Institute of Technology, and a PhD student researcher on the Gboard team at Google Tokyo. Prior to this, I was a software engineer at Google, also on Gboard. I did my MS in Computer Science at Yonsei University and my BS in Discrete Mathematics with a minor in Korean at Georgia Tech.

I am always open to chatting about interesting topics. Please feel free to send me an email ([lastname].[firstname]@gmail.com).

I am (not exhaustively) interested in:

  • Automata Theory

  • Scientific Computing

  • Languages (especially Korean and Esperanto)

  • Combinatorics

  • Open Source Software

  • High School Level Computer Science Education

μžκΈ°μ†Œκ°œ

μ €λŠ” λ„μΏ„κ³΅μ—…λŒ€ν•™(Tokyo Institute of Technology)의 Okazaki μ—°κ΅¬μ‹€μ˜ λ°•μ‚¬ν•˜μ • 학생이고 ꡬ글 μ§€λ³΄λ“œ(Gboard)νŒ€μ˜ 개발자 마λ₯΄μ½”μž…λ‹ˆλ‹€. μ‘°μ§€μ•„ν…μ—μ„œ μ΄μ‚°μˆ˜ν•™μ„ 전곡, ν•œκ΅­μ–΄λ₯Ό λΆ€μ „κ³΅ν•˜μ˜€κ³  이후에 μ—°μ„ΈλŒ€ν•™κ΅μ˜ 계산이둠 μ—°κ΅¬μ‹€μ—μ„œ 컴퓨터 κ³Όν•™ μ„μ‚¬ν•™μœ„λ₯Ό μ™„λ£Œν•˜μ˜€μŠ΅λ‹ˆλ‹€.

ν₯미둜운 μ£Όμ œκ°€ μžˆλ‹€λ©΄, μ–Έμ œλ“  λˆ„κ΅¬μ™€λ„ 이야기 λ‚˜λˆ„κ³  μ‹ΆμŠ΅λ‹ˆλ‹€. μ΄λ©”μΌλ‘œ μ—°λ½ν•΄μ£Όμ„Έμš” ([μ„±].[이름]@gmail.com).

μ œκ°€ 특히 μ’‹μ•„ν•˜λŠ” μ£Όμ œλ“€μ€ λ‹€μŒκ³Ό κ°™μŠ΅λ‹ˆλ‹€:

  • μ˜€ν† λ§ˆνƒ€ 이둠

  • μˆ˜μΉ˜ν•΄μ„ν•™

  • μ–Έμ–΄ (특히 ν•œκ΅­μ–΄ν•˜κ³  μ—μŠ€νŽ˜λž€ν† )

  • μ‘°ν•©λ‘ 

  • μ˜€ν”ˆμ†ŒμŠ€ μ†Œν”„νŠΈμ›¨μ–΄

  • 고등학ꡐ μˆ˜μ€€μ˜ μ»΄ν“¨ν„°κ³Όν•™κ΅μœ‘

λͺ¨λ“  κ²Œμ‹œλ¬Όμ€ 주둜 μ˜μ–΄λ₯Ό μ‚¬μš©ν•΄μ„œ μ“°κ³  μžˆμ§€λ§Œ, ν•œκ΅­μ–΄λ₯Ό μ—°μŠ΅ν•˜κΈ° μœ„ν•΄μ„œ 가끔 ν•œκΈ€ κ²Œμ‹œλ¬Όμ„ μž‘μ„±ν•˜κ±°λ‚˜ μ˜μ–΄ κ²Œμ‹œλ¬Όλ“€μ„ ν•œκ΅­μ–΄λ‘œ λ²ˆμ—­ν•΄μ„œ 올리고 μžˆμŠ΅λ‹ˆλ‹€.