/*

For certain inputs (like artist name or labels), we can't allow non-latin characters
(Japanese, Chinese, Hindi, etc.) because they are impossible to search for, don't
translate into url-safe strings and thus are virtually inaccessible to the majority
of roman-alphabet based users. This function allows us to check for non-latin characters.

To avoid unnecessarily invalidating strings where only one or a few characters can't
be processed, we look at the number of non-latin characters in the context of the
whole string.

Strict Logic:
•	In strict mode, we only allow letters, digits, whitespace and the characters & + -
	from 'Basic Latin'.
•	Used for things like labels, where we never want any odd characters.
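
A minimal sketch of the strict check on its own, assuming the same character class
as the strict branch of the function at the bottom of this file. The helper name
failsStrict is hypothetical and only used for this example:

```javascript
// Hypothetical helper; the character class is copied from the strict branch of isNonLatin.
function failsStrict(str) {
	// Anything that is not a letter, digit, whitespace, &, + or - fails.
	return /[^a-zA-Z0-9&+\-\s]/.test(str)
}

console.log(failsStrict('Warp Records')) // false
console.log(failsStrict('R&S')) // false
console.log(failsStrict('Label_1')) // true (underscore is not allowed)
console.log(failsStrict('Se\u00F1or')) // true (ñ is outside Basic Latin)
```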

Default logic:
•	Used for inputs like artist names.
•	If there is at least one non-Latin character, we return true.
•	In normal mode, we allow characters from 'Basic Latin', 'Latin-1 Supplement' (from À
	onwards), 'Latin Extended-A', 'Latin Extended-B' and 'Latin Extended Additional'.
	Latin Extended-C and -D are lumped together with non-Latin characters because they
	can't be normalized (see below).
•	We check the string before and after normalization. If any of the characters
	can't be normalized, we invalidate the string. So Đorđević won't pass because
	Đ and đ don't normalize, but Sağănēç will pass because all diacritics fold.
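
The before/after comparison can be sketched roughly as follows. The helper names
foldDiacritics and foldsToBasicLatin are hypothetical and only illustrate the idea;
the folding expression is the one used in the function at the bottom of this file:

```javascript
// Hypothetical helper: strips diacritics from a string.
function foldDiacritics(str) {
	// NFD splits each precomposed letter into a base letter plus combining marks,
	// which live in the Combining Diacritical Marks block (U+0300-U+036F).
	return str.normalize('NFD').replace(/[\u0300-\u036f]/g, '')
}

// Hypothetical helper: true when only Basic Latin characters remain after folding.
function foldsToBasicLatin(str) {
	return !/[^\u0000-\u007F]/.test(foldDiacritics(str))
}

console.log(foldDiacritics('Sağănēç')) // 'Saganec'
console.log(foldsToBasicLatin('Sağănēç')) // true
console.log(foldsToBasicLatin('Đorđević')) // false, Đ and đ have no decomposition
```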



More info
- - -
Unicode character distribution overview
http://www.unicode.org/charts/

Stack overflow:
https://stackoverflow.com/a/24163970/2262741



Foreign scripts to test
- - -
Chinese:	在中國有大熊貓
Japanese:	日本には寿司があります
Hindi:		भारत में हाथी हैं
Arabic:		في شبه الجزيرة العربية توجد الجمال
Hebrew:		בישראל יש פשעי זכויות אדם
Urdu:		پاکستان میں انہیں چائے پسند ہے۔
- - -
Latin with non-foldable diacritics:		Đorđević	--> won't pass
Latin with foldable diacritics:			Sağănēç		--> will pass



About Normalization
- - -
Characters from Latin Extended-A, Extended-B and Extended Additional can mostly be
normalized into Basic Latin characters (using <string>.normalize('NFD') --> see urlify()),
so we include them in our list of accepted characters. Extensions C & D are not included
because they can't be normalized, and thus we consider them foreign script.
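
One rough way to explore this per block is to count the code points whose NFD form
differs from the original character. This is only a sketch: countDecomposable is a
hypothetical helper, and it measures canonical decomposition only, so its results are
not expected to reproduce the 'Normalizable' figures in the breakdown, which may have
been counted by a different criterion.

```javascript
// Hypothetical helper: counts the code points in [start, end] whose NFD form
// differs from the character itself, i.e. that have a canonical decomposition.
function countDecomposable(start, end) {
	let count = 0
	for (let cp = start; cp <= end; cp++) {
		const ch = String.fromCodePoint(cp)
		if (ch.normalize('NFD') !== ch) count++
	}
	return count
}

console.log('Basic Latin:', countDecomposable(0x0000, 0x007f))
console.log('Latin Extended-A:', countDecomposable(0x0100, 0x017f))
console.log('Latin Extended-C:', countDecomposable(0x2c60, 0x2c7f))
```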

Breakdown:
- - -
Basic Latin:
Range:			0000-007F
Normalizable:	65 / 93 (69.9%)
Characters:		!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`ab
				cdefghijklmnopqrstuvwxyz{|}~

--
Latin-1 Supplement:
Range:			0080-00FF
Normalizable:	53 / 95 (55.8%)
Characters:		¡¢£¤¥¦§¨©ª«¬®¯°±²³´μ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâ
				ãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ

--
Extended A:
Range:			0100-017F
Normalizable:	109 / 129 (84.5%)
Characters:		AĀāĂăĄąĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪīĬĭĮįİıĲĳĴĵĶķĸĹĺĻļĽľĿŀ
				ŁłŃńŅņŇňŉŊŋŌōŎŏŐőŒœŔŕŖŗŘřŚśŜŝŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽžſ

--
Extended B:
Range:			0180-024F
Normalizable:	83 / 208 (40%)
Characters:		ƀƁƂƃƄƅƆƇƈƉƊƋƌƍƎƏƐƑƒƓƔƕƖƗƘƙƚƛƜƝƞƟƠơƢƣƤƥƦƧƨƩƪƫƬƭƮƯưƱƲƳƴƵƶƷƸƹƺƻƼƽƾƿǀǁǂ
				ǃǄǅǆǇǈǉǊǋǌǍǎǏǐǑǒǓǔǕǖǗǘǙǚǛǜǝǞǟǠǡǢǣǤǥǦǧǨǩǪǫǬǭǮǯǰǱǲǳǴǵǶǷǸǹǺǻǼǽǾǿȀȁȂȃȄȅ
				ȆȇȈȉȊȋȌȍȎȏȐȑȒȓȔȕȖȗȘșȚțȜȝȞȟȠȡȢȣȤȥȦȧȨȩȪȫȬȭȮȯȰȱȲȳȴȵȶȷȸȹȺȻȼȽȾȿɀɁɂɃɄɅɆɇɈ
				ɉɊɋɌɍɎɏ

--
Extended C:
Range:			2C60-2C7F
Normalizable:	0
Characters:		ⱠⱡⱢⱣⱤⱥⱦⱧⱨⱩⱪⱫⱬⱭⱮⱯⱰⱱⱲⱳⱴⱵⱶⱷⱸⱹⱺⱻⱼⱽⱾⱿ

--
Extended D:
Range:			A720-A7FF
Normalizable:	0
Characters:		꜠꜡ꜢꜣꜤꜥꜦꜧꜨꜩꜪꜫꜬꜭꜮꜯꜰꜱꜲꜳꜴꜵꜶꜷꜸꜹꜺꜻꜼꜽꜾꜿꝀꝁꝂꝃꝄꝅꝆꝇꝈꝉꝊꝋꝌꝍꝎꝏꝐꝑꝒꝓꝔꝕꝖꝗꝘ
				ꝙꝚꝛꝜꝝꝞꝟꝠꝡꝢꝣ꟪꟫ꝦꝧꝨꝩꝪꝫꝬꝭꝮꝯꝰꝱꝲꝳꝴꝵꝶꝷꝸꝹꝺꝻꝼꝽꝾꝿꞀꞁꞂꞃꞄꞅꞆꞇꞈ꞉꞊ꞋꞌꞍꞎꞏꞐꞑꞒꞓꞔꞕꞖꞗꞘꞙꞚꞛꞜꞝꞞ
				ꞟꞠꞡꞢꞣꞤꞥꞦꞧꞨꞩꞪꞫꞬꞭꞮꞯꞰꞱꞲꞳꞴꞵꞶꞷꞸꞹꞺꞻꞼꞽꞾꞿꟀꟁꟂꟃꟄꟅꟆꟇꟈꟉꟊꟐꟑꟓꟕꟖꟗꟘꟙꟲꟳꟴꟵꟶꟷꟸꟹꟺꟻꟼꟽꟾꟿ

--
Extended Additional:
Range:			1E00-1EFF
Normalizable:	244 / 256 (95.3%)
Characters: 	ḀḁḂḃḄḅḆḇḈḉḊḋḌḍḎḏḐḑḒḓḔḕḖḗḘḙḚḛḜḝḞḟḠḡḢḣḤḥḦḧḨḩḪḫḬḭḮḯḰḱḲḳḴḵḶḷḸḹḺḻḼḽḾḿṀṁṂṃṄ
				ṅṆṇṈṉṊṋṌṍṎṏṐṑṒṓṔṕṖṗṘṙṚṛṜṝṞṟṠṡṢṣṤṥṦṧṨṩṪṫṬṭṮṯṰṱṲṳṴṵṶṷṸṹṺṻṼṽṾṿẀẁẂẃẄẅẆẇẈẉ
				ẊẋẌẍẎẏẐẑẒẓẔẕẖẗẘẙẚẛẜẝẞẟẠạẢảẤấẦầẨẩẪẫẬậẮắẰằẲẳẴẵẶặẸẹẺẻẼẽẾếỀềỂểỄễỆệỈỉỊịỌọỎỏ
				ỐốỒồỔổỖỗỘộỚớỜờỞởỠỡỢợỤụỦủỨứỪừỬửỮữỰựỲỳỴỵỶỷỸỹỺỻỼỽỾỿ




REGEX BREAKDOWN
- - - - - - - -

\u0000-\u007F
- - -
•	Basic Latin

\u00C0-\u024F
- - -
•	Latin-1 Supplement starting at \u00C0 or À (we invalidate the non-linguistic characters)
•	Latin Extended-A
•	Latin Extended-B

\u1E00-\u1EFF
- - -
•	Latin Extended Additional
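
As a sketch of how these ranges combine, the two character classes can be used to
sort a string into three buckets. The constant names, the classify helper and the
bucket labels are hypothetical and only used for illustration:

```javascript
// Character classes built from the ranges listed above; the names are hypothetical.
const NOT_BASIC_LATIN = /[^\u0000-\u007F]/
const NOT_ACCEPTED_LATIN = /[^\u0000-\u007F\u00C0-\u024F\u1E00-\u1EFF]/

// Hypothetical helper: reports which set of ranges covers the whole string.
function classify(str) {
	if (!NOT_BASIC_LATIN.test(str)) return 'basic'
	if (!NOT_ACCEPTED_LATIN.test(str)) return 'extended'
	return 'rejected'
}

console.log(classify('Sigur Ros')) // 'basic'
console.log(classify('Sigur R\u00F3s')) // 'extended' (ó is U+00F3)
console.log(classify('\u00A1Hola!')) // 'rejected' (¡ is U+00A1, below \u00C0)
console.log(classify('日本には寿司があります')) // 'rejected'
```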

*/

// Check if any characters in the submitted string are non-roman.
// Returns true if any match is found.
module.exports = function isNonLatin(str, options) {
	if (!str) return false
	// "strict" option doesn't allow any odd characters, default allows extended latin.
	// "fully" option only returns true if ALL characters are non-latin (used to detect foreign script).
	const { strict, fully } = options || {}

	const isBasicLatin = !str.match(/[^\u0000-\u007F]/)
	const isExtendedLatin = !str.match(/[^\u0000-\u007F\u00C0-\u024F\u1E00-\u1EFF]/)
	if (isBasicLatin) {
		// Strict mode only accepts letters, digits, whitespace and & + -
		if (strict) return !!str.match(/[^a-zA-Z0-9&+\-\s]/)
		return false
	}
	if (isExtendedLatin) {
		if (strict) return true
		if (fully) return false
		// Fold diacritics - https://stackoverflow.com/a/37511463/2262741
		const folded = str.normalize('NFD').replace(/[\u0300-\u036f]/g, '')
		// Remove all basic latin characters so we can process the extended ones.
		const nonBasic = folded.replace(/[\u0000-\u007F]/g, '')
		// Strip word characters, whitespace and hyphens; anything left over could not be folded.
		const nonWord = nonBasic.replace(/[\w\s-]/gi, '')
		return !!nonWord.length
	} else if (fully) {
		str = str.replace(/\s+/g, '') // Remove whitespace
		// Only report a fully foreign script if no character falls inside any accepted Latin range.
		const noLatin = !str.match(/[\u0000-\u007F]/)
		const noExtendedLatin = !str.match(/[\u00C0-\u024F\u1E00-\u1EFF]/)
		return noLatin && noExtendedLatin
	} else {
		return true
	}
}
