Author: maroon
Added: 1y
Updated: Never
mIRC: 7.48+
Hits: 363
Downloads: 23
Review: Jawsh
Size: 1.97KB
$isutf8
v1.0
Tests whether &binvar contains a valid UTF8 string, assuming $utfencode and $utfdecode don't lie.
//bset -c &v 1 195 169 | echo -a $isutf8(&v) | bset -c &v 1 193 169 | echo -a $isutf8(&v)
Source:
/*
{
  $isutf8 by maroon (c) 2022, syntax: $isutf8(&binvar)
  returns $true or $false if &binvar contains a valid UTF8 string according to $utfencode/$utfdecode
  if this returns a wrong answer, either there's a bug in $utfencode or $utfdecode or $bvar
  or $regsubex or $sha1, or congratz for finding an $sha1 collision!

  known issues:
  + using $sha1 against strings > 500 to support versions with $maxlenl at 4k
  + reports utf8 surrogates as $false, so a &binvar containing the 6 byte utf8 encoding of
    2 utf8 surrogates for an emoji reports $false, but containing the 4 byte utf8 encoding
    of the emoji's codepoint returns $true
  + v6.35 falsely reports single byte values in 128-255 as valid utf8, and doesn't correctly
    handle codepoints > 255

  if needing this to work prior to v7.44 replace the 1st line with this next line:
  if (!$0) goto syntax | if (($version >= 7.44) && (!$bvar($1))) goto syntax

  if needing this to work prior to v7.48 replace the line containing $regsubex with the next 2 lines:
  if ($version >= 7.48) noop $regsubex(foo,$utfdecode($utfencode($bvar($1,1-).text)),,,&maroon.isutf8b)
  else { bset -tc &maroon.isutf8b 1 $encode($utfdecode($utfencode($bvar($1,1-).text)),m) | noop $decode(&maroon.isutf8b,bm) }
}
*/
alias isutf8 {
  if ((!$0) || (!$bvar($1))) goto syntax
  if (!$bvar($1,0)) return $false
  bcopy -c &maroon.isutf8a 1 $1 1 -1
  noop $regsubex(foo,$utfdecode($utfencode($bvar($1,1-).text)),,,&maroon.isutf8b)
  if ($bvar(&maroon.isutf8a,0) != $bvar(&maroon.isutf8b,0)) return $false
  if ($bvar(&maroon.isutf8a,0) <= 500) {
    if ($bvar(&maroon.isutf8a,1-) != $bvar(&maroon.isutf8b,1-)) return $false
  }
  else {
    if ($sha1(&maroon.isutf8a,1) != $sha1(&maroon.isutf8b,1)) return $false
  }
  return $true
  :syntax
  echo -sc info * invalid parameter: $ $+ isutf8 | halt
}
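The surrogate behaviour listed under "known issues" can be tried with a short test alias. This is only a sketch: the alias name isutf8demo is hypothetical and not part of the snippet, and the byte values are hand-encoded for U+1F600 (4 byte form) and for its surrogate pair D83D DE00 (two 3 byte sequences). It assumes $isutf8 above is already loaded.

```
alias isutf8demo {
  ; U+1F600 as a normal 4 byte UTF8 sequence: F0 9F 98 80
  bset -c &v 1 240 159 152 128
  echo -a 4 byte emoji: $isutf8(&v)

  ; the same emoji as 2 surrogates, each encoded as 3 bytes: ED A0 BD ED B8 80
  bset -c &v 1 237 160 189 237 184 128
  echo -a 6 byte surrogate pair: $isutf8(&v)
}
```

Going by the author's notes, the first line should report $true and the second $false. The actual result depends on how $utfencode and $utfdecode behave in the mIRC version in use.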