27 points | by franze 5 days ago
6 comments
Cool! Reminds me of Evan's ocr script:
https://evanhahn.com/mac-ocr-script/
The OCR example says it recognizes Chinese, but the output ignores it - maybe just an AI bug in the generated examples
This is neat. I wonder… Is there a more comprehensive analysis of how the Apple Vision framework compares to other multi-modal AI? Would there be any benefit to pre-processing images via Auge before handing them to Claude or GPT?
Finding that out is why I built it.
I just released a new version, https://auge.franzai.com/, that might have some more value now.
Very interesting, keep it up!
Very cool