Pybot Wiki

This page describes a user-fix which is ordinarily used with the replace.py script. The chances are that you will not be able to use this code directly on your wiki. It is an example of the kind of thing you can do with user-fixes. You will have to update the code on this page to fit your particular circumstance.

BritSpell
Creator: CzechOut
Code location: tardis:T:SBOT LIST
What it does: Corrects the vast majority of American spellings, giving words their proper British (OED) spelling
Complexity: Advanced

BritSpell is a user-fix that you can use with the replace.py script. It corrects for virtually every known difference between American and British spelling, converting Americanisms into standard, British spellings, according to the OED.

It has been in "manual mode" at w:c:tardis so that an extensive list of exceptions could be made. Thus, the fix has the subtlety to correct honor into honour, but it knows that Honor Blackman is not Honour Blackman.

The exception list will almost certainly not be of use to most users, since the exceptions are specific to the topic of Doctor Who.

Code


#BritSpell version 1.05
#Enforces BrEng spelling, with exceptions
#relevant to the Doctor Who universe 
#and usage found on tardis.wikia.com
#released under CC-BY-SA 3.0 license 
#by User:CzechOut
#Originally published: 1 November 2011
#Current version: 21 February 2014

fixes['BritSpell'] = {
    'regex': True,
    'recursive': True,
    'msg': {
        'en':u'Enforcing British spellings, as outlined at [[tardis:T:SPELL]]'
        },
    'replacements': [
        #AAAA#
        (r'([Aa])ccessoriz(.?)', r'\1ccessoris\2'),
        (r'([Aa])cclimatiz(.?)',r'\1cclimatis\2'),
        (r'([Aa])ccouterments',r'\1ccoutrements'),
        (r'( +)eon( +)',r'\1aeon\2'),
        (r'( +)eons( +)',r'\1aeons\2'),
        (r'([Aa])erogram( +)',r'\1erogramme\2'),
        (r'([Aa])erograms',r'\1erogrammes'),
        (r'( +)esthete(.?)( +)',r'\1aesthete\2\3'),
        (r'( +)esthetic(.?)( +)',r'\1aesthetic\2\3'),
        (r'( +)etiology',r'\1aetiology'),
        (r'( +)aging',r'\1ageing'),
        (r'([Dd])e(.?)aging',r'\1e\2ageing'),
        (r'([Aa])ggrandizement',r'\1ggrandisement'),
        (r'([Aa])goniz(.?)', r'\1gonis\2'),
        (r'([Aa])luminum', r'\1luminium'),
        (r'([Aa])mortize( +)',r'\1mortise\2'),
        (r'([Aa])mortiz(.?)',r'\1mortis\2'),
        (r'(.?)([Tt])heater(.?)',r'\1\2heatre\3'),
        (r'([Aa])nemi(.?)',r'\1naemi\2'),
        (r'([Aa])nesthesia',r'\1naesthesia'),
        (r'([Aa])nestheti(.?)',r'\1naestheti\2'),
        (r'([Aa])nalog( +)',r'\1nalogue\2'),
        (r'([Aa])nalogs',r'\1nalogues'),
        (r'(.?)([Aa])nalyze( +)',r'\1\2nalyse\3'),
        (r'(.?)([Aa])nalyz(.?)',r'\1\2nalys\3'),
        (r'([Aa])ngliciz(.?)',r'\1nglicis\2'),
        (r'([Aa])nnualized',r'\1nnualised'),
        (r'([Aa])ntagoniz(.?)',r'\1ntagonis\2'),
        (r'([Aa])pologiz(.?)',r'\1pologis\2'),
        (r'([Aa])ppall( +)',r'\1ppal\2'), 
        (r'([Aa])ppalls',r'\1ppals'),
        (r'([Aa])ppetiz(.?)',r'\1ppetis\2'),
        (r'([Aa])rbor(.?)',r'\1rbour\2'),
        (r'([Aa])rcheolog(.?)',r'\1rchaeolog\2'),
        (u'ardor',u'ardour'),
        (r'([Aa])rmor(.?)',r'\1rmour\2'),
        (r'([Aa])rtifact(.?)',r'\1rtefact\2'),
        (r'(.?)([Aa])uthoriz(.?)',r'\1\2uthoris\3'),
        (r'( +)([Aa])x( +)',r'\1\2xe\3'),
        #BBBB#
        (r'(.?)([Pp])edaled', r'\1\2edalled'),
        (r'(.?)([Pp])edaling', r'\1\2edalling'),
        (r'([Bb])aptiz(.?)',r'\1aptis\2'),
        (r'([Bb])astardiz(.?)',r'\1astardis\2'),
        (r'([Bb])attleax( +)',r'\1attleaxe\2'),
        (r'([Bb])alk(.?)',r'\1aulk\2'),
        (r'([Bb])edeviled',r'\1edevilled'),
        (r'([Bb])edeviling',r'\1edevilling'),
        (r'(.?)([Bb])ehavior(.?)',r'\1\2ehaviour\3'),
        (r'([Bb])ehoove(.?)',r'\1ehove\2'),
        (r'([Bb])ejeweled',r'\1ejewelled'),
        (r'(.?)([Ll])abor( +)',r'\1\2abour\3'),
        (r'(.?)([Ll])abored',r'\1\2aboured'),
        (r'([Bb])eveled',r'\1evelled'),
        (r'([Bb])evies',r'\1evvies'),
        (r'([Bb])evy',r'\1evvy'),
        (r'([Bb])iased',r'\1iassed'),
        (r'([Bb])iasing',r'\1iassing'),
        (r'([Bb])inging',r'\1ingeing'),
        (r'([Bb])ougainvillea(.?)',r'\1ougainvillaea\2'),
        (r'([Bb])owdleriz(.?)',r'\1owdleris\2'),
        (r'([Bb])reathalyz(.?)',r'\1reathalys\2'),
        (r'([Bb])rutaliz(.?)',r'\1rutalis\2'),
        (r'(.?)([Bb])usses',r'\1\2uses'),
        (r'([Bb])ussing',r'\1using'),
        #CCCC#
        (r'([Cc])esarean(.?)',r'\1aesarean\2'),
        (r'([Cc])aliber(.?)',r'\1alibre\2'),
        (r'([Cc])aliper(.?)',r'\1alliper\2'),
        (r'([Cc])alisthenics',r'\1allisthenics'),
        (r'([Cc])analiz(.?)',r'\1analis\2'),
        (r'([Cc])ancelation',r'\1ancellation'),
        (r'([Cc])ancelations',r'\1ancellations'),
        (r'([Cc])anceled',r'\1ancelled'),
        (r'([Cc])anceling',r'\1ancelling'),
        (r'([Cc])andor',r'\1andour'),
        (r'([Cc])annibaliz(.?)',r'\1annibalis\2'),
        (r'([Cc])anibaliz(.?)',r'\1annibalis\2'),
        (r'([Cc])anibalis(.?)',r'\1annibalis\2'),
        (r'([Cc])anoniz(.?)',r'\1anonis\2'),
        (r'([Cc])apitaliz(.?)',r'\1apitalis\2'),
        (r'([Cc])arameliz(.?)',r'\1aramelis\2'),
        (r'([Cc])arboniz(.?)',r'\1arbonis\2'),
        (r'([Cc])aroled',r'\1arolled'),
        (r'([Cc])aroling',r'\1arolling'),
        (r'([Cc])atalog( +)',r'\1atalogue\2'),
        (r'([Cc])atalogs( +)',r'\1atalogues\2'),
        (r'([Cc])ataloged',r'\1atalogued'),
        (r'([Cc])ataloging',r'\1ataloguing'),
        (r'([Cc])atalyz(.?)',r'\1atalys\2'),
        (r'([Cc])ategoriz(.?)',r'\1ategoris\2'),
        (r'([Cc])auteriz(.?)',r'\1auteris\2'),
        (r'([Cc])aviled',r'\1avilled'),
        (r'([Cc])aviling',r'\1avilling'),
        (r'(.?)([Gg])ram( +)',r'\1\2ramme\3'),
        (r'(.?)([Gg])rams',r'\1\2rammes'),
        (r'(.?)([Ll])iter( +)',r'\1\2itre\3'),
        (r'(.?)([Ll])iters',r'\1\2itres'),        
        (r'(.?)([Mm])eter(.?)',r'\1\2etre\3'),
        (r'([Cc])entraliz(.?)',r'\1entralis\2'),
        (r'(.?)([Cc])enter( +)',r'\1\2entre\3'),
        (r'([Cc])enters',r'\1entres'),
        (r'([Cc])entered',r'\1entred'),
        (r'([Cc])entering',r'\1entring'),
        (r'([Cc])hanneled',r'\1hannelled'),
        (r'([Cc])hanneling',r'\1hannelling'),
        (r'([Cc])haracteriz(.?)',r'\1haracteris\2'),
        (r'([Cc])heckbook(.?)',r'\1hequebook\2'),
        (r'([Cc])hili',r'\1hilli'),
        (r'([Cc])hiseled',r'\1hiselled'),
        (r'([Cc])hiseling',r'\1hiselling'),
        (r'([Cc])irculariz(.?)',r'\1ircularis\2'),
        (r'(.?)([Cc])iviliz(.?)',r'\1\2ivilis\3'),
        (r'([Cc])lamor(.?)',r'\1lamour\2'),
        (r'([Cc])langor',r'\1langour'),
        (r'([Cc])larinetist',r'\1larinettist'),
        (r'([Cc])ollectiviz(.?)',r'\1ollectivis\2'),
        (r'([Cc])oloniz(.?)',r'\1olonis\2'),
        (r'([Cc])olor(.?)',r'\1olour\2'),
        (r'(.?)([Cc])olored',r'\1\2oloured'),
        (r'(.?)([Cc])oloring',r'\1\2olouring'),
        (r'(.?)([Cc])oloriz(.?)',r'\1\2olouris\3'),
        (r'([Cc])ommercializ(.?)',r'\1ommercialis\2'),
        (r'([Cc])ompartmentaliz(.?)',r'\1ompartmentalis\2'),
        (r'([Cc])omputeriz(.?)',r'\1omputeris\2'),
        (r'([Cc])onceptualiz(.?)',r'\1onceptualis\2'),
        (r'([Cc])ontextualiz(.?)',r'\1ontextualis\2'),
        (r'([Cc])oz(.?)',r'\1os\2'),
        (r'([Cc])ouncilor(.?)',r'\1ouncillor\2'),
        (r'([Cc])ounselor(.?)',r'\1ounsellor\2'),
        (r'([Cc])ounseling',r'\1ounselling'),
        (r'([Cc])ounseled',r'\1ounselled'),
        (r'([Cc])renelated',r'\1renellated'),
        (r'([Cc])riminaliz(.?)',r'\1riminalis\2'),
        (r'([Cc])riticiz(.?)',r'\1riticis\2'),
        (r'([Cc])rueler',r'\1rueller'),
        (r'([Cc])ruelest',r'\1ruellest'),
        (r'([Cc])rystalliz(.?)',r'\1rystallis\2'),
        (r'([Cc])udgeled', r'\1udgelled'),
        (r'([Cc])udgeling',r'\1udgelling'),
        (r'([Cc])ustomiz(.?)',r'\1ustomis\2'),
        (r'( +)([Cc])ipher(.?)',r'\1\2ypher\3'),
        #DDDD#
        (r'([Dd])ecentraliz(.?)',r'\1ecentralis\2'),
        (r'([Dd])ecriminaliz(.?)',r'\1ecriminalis\2'),
        (r'([Dd])efense(.?)',r'\1efence\2'),
        (r'(.?)([Hh])umaniz(.?)',r'\1\2umanis\3'),
        (r'(.?)([Dd])emeanor',r'\1\2emeanour'),
        (r'(.?)([Mm])ilitariz(.?)',r'\1\2ilitaris\3'),
        (r'(.?)([Mm])obiliz(.?)',r'\1\2obilis\3'),
        (r'([Dd])emocratiz(.?)',r'\1emocratis\2'),
        (r'([Dd])emoniz(.?)',r'\1emonis\2'),
        (r'(.?)([Mm])oraliz(.?)',r'\1\2oralis\3'),
        (r'(.?)([Nn])ationaliz(.?)',r'\1\2ationalis\3'),
        (r'([Dd])eodoriz(.?)',r'\1eodoris\2'),
        (r'(.?)([Pp])ersonaliz(.?)',r'\1\2ersonalis\3'),
        (r'([Dd])eputiz(.?)',r'\1eputis\2'),
        (r'(.?)([Ss])ensitiz(.?)',r'\1\2ensitis\3'),
        (r'(.?)([Ss])tabiliz(.?)',r'\1\2tabilis\3'),
        (r'([Dd])ialed',r'\1ialled'),
        (r'([Dd])ialing',r'\1ialling'),
        (r'([Dd])ialog( +)',r'\1ialogue\2'),
        (r'([Dd])ialogs( +)',r'\1ialogues\2'),
        (r'([Dd])iarrhea',r'\1iarrhoea'),
        (r'([Dd])igitiz(.?)',r'\1igitis\2'),
        (r'([Dd])isemboweled',r'\1isembowelled'),
        (r'([Dd])isemboweling',r'\1isembowelling'),
        (r'(.?)([Ff])avor(.?)',r'\1\2avour\3'),
        (r'([Dd])isheveled',r'\1ishevelled'),
        (r'(.?)honor(.?)',r'\1honour\2'), #making this recognize only lower-case h because of Honor Blackman and Honore
        (r'(.?)([Oo])rganization(.?)',r'\1\2rganisation\3'),
        (r'([Dd])istill( +)',r'\1istil\2'),
        (r'([Dd])istills',r'\1istils'),
        (r'([Dd])ramatiz(.?)',r'\1ramatis\2'),
        #(r'([Dd])rafts(.+)',r'\1raughts\2'), will need to do something else for draughtman, people
        (r'([Dd])rafty',r'\1raughty'),
        (r'([Dd])rafti(.?)',r'\1raughti\2'),
        (r'([Dd])riveled',r'\1rivelled'),
        (r'([Dd])riveling',r'\1rivelling'),
        (r'([Dd])ueled',r'\1uelled'),
        (r'([Dd])ueling',r'\1uelling'),
        #EEEE#
        (r'([Ee])conomiz(.?)',r'\1conomis\2'),
        (r'( +)edema',r'\1oedema'),
        (r'( +)Edema',r'\1Oedema'),
        (r'([Ee])ditorializ(.?)',r'\1ditorialis\2'),
        (r'([Ee])mpathiz(.?)',r'\1mpathis\2'),
        (r'(.?)([Ee])mphasiz(.?)',r'\1\2mphasis\3'),
        (r'([Ee])nameled',r'\1namelled'),
        (r'([Ee])nameling',r'\1namelling'),
        (r'([Ee])namor(.?)',r'\1namour\2'),
        (r'([Ee])ncyclopedi(.?)',r'\1ncyclopaedi\2'),
        (r'([Ee])ndeavor(.?)',r'\1ndeavour\2'),
        (r'(.?)([Ee])nergiz(.?)',r'\1\2nergis\3'),
        (r'([Ee])nroll( +)',r'\1nrol\2'),
        (r'([Ee])nrolls( +)',r'\1nrols\2'),
        (r'([Ee])nrollment( +)',r'\1nrolment\2'),
        (r'([Ee])nthrall( +)',r'\1nthral\2'), #only enthrall is one l
        (r'([Ee])paulet( +)',r'\1paulette\2'),
        (r'([Ee])paulets',r'\1paulettes'),
        (r'([Ee])pilog( +)',r'\1pilogue\2'),
        (r'([Ee])pilogs',r'\1pilogues'),
        (r'([Ee])pitomiz(.?)',r'\1pitomis\2'),
        (r'([Ee])qualiz(.?)',r'\1qualis\2'),
        (r'([Ee])ulogiz(.?)',r'\1ulogis\2'),
        (r'([Ee])vangeliz(.?)',r'\1vangelis\2'),
        (r'([Ee])xorciz(.?)',r'\1xorcis\2'),
        (r'(.?)([Tt])emporiz(.?)',r'\1\2emporis\3'),
        (r'([Ee])xternaliz(.?)',r'\1xternalis\2'),
        #FFFF#
        (r'([Ff])actoriz(.?)',r'\1actoris\2'),
        (r'([Ff])eces',r'\1aeces'),
        (r'([Ff])ecal',r'\1aecal'),
        (r'([Ff])amiliariz(.?)',r'\1amiliaris\2'),
        (r'([Ff])antasiz(.?)',r'\1antasis\2'),
        (r'([Ff])eminiz(.?)',r'\1eminis\2'),
        (r'([Ff])ertiliz(.?)',r'\1ertilis\2'),
        (r'([Ff])ervor',r'\1ervour'),
        (r'([Ff])iber(.?)',r'\1ibre\2'),
        (r'([Ff])ictionaliz(.?)',r'\1ictionalis\2'),
        (r'([Ff])ilet(.?)',r'\1illet\2'),
        (r'([Ff])inaliz(.?)',r'\1inalis\2'),
        (r'(.?)([Ff])lavor(.?)',r'\1\2lavour\3'),
        (r'([Ff])etal',r'\1oetal'),
        (r'([Ff])etus(.?)',r'\1oetus\2'),
        (r'([Ff])etid',r'\1oetid'),
        (r'([Ff])ormaliz(.?)',r'\1ormalis\2'),
        (r'([Ff])ossiliz(.?)',r'\1ossilis\2'),
        (r'([Ff])raterniz(.?)',r'\1raternis\2'),
        (r'([Ff])ulfill( +)',r'\1ulfil\2'),
        (r'([Ff])ulfillment',r'\1ulfilment'),
        (r'([Ff])unneled',r'\1unnelled'),
        (r'([Ff])unneling',r'\1unnelling'),
        #GGGG#
        (r'([Gg])alvaniz(.?)',r'\1alvanis\2'),
        (r'([Gg])amboled',r'\1ambolled'),
        (r'([Gg])amboling',r'\1ambolling'),
        (r'([Gg])eneraliz(.?)',r'\1eneralis\2'),
        (r'([Gg])hettoiz(.?)',r'\1hettois\2'),
        (r'([Gg])lamoriz(.?)',r'\1lamoris\2'),
        (r'([Gg])lamor( +)',r'\1lamour\2'),
        (r'([Gg])lobaliz(.?)',r'\1lobalis\2'),
        (r'([Gg])luing',r'\1lueing'),
        (r'([Gg])oiter(.?)',r'\1oitre\2'),
        (r'([Gg])onorrhea',r'\1onorrhoea'),
        (r'([Gg])raveled',r'\1ravelled'),
        (r'gray( +)',r'grey\1'), #probably shouldn't include cap G#
        (r'gray(.?)',r'grey\1'),
        (r'([Gg])roveled',r'\1rovelled'),
        (r'([Gg])roveling',r'\1rovelling'),
        (r'([Gg])rueling(.?)',r'\1ruelling\2'),
        (r'([Gg])ynecol(.?)',r'\1ynaecol\2'),
        #HHHH#
        (r'([Hh])ematolog(.?)',r'\1aematolog\2'),
        (r'([Hh])emo(.?)',r'\1aemo\2'),
        (r'([Hh])arbor(.?)',r'\1arbour\2'),
        (r'([Hh])armoniz(.?)',r'\1armonis\2'),
        (r'([Hh])omeopath(.?)',r'\1omoeopath\2'),
        (r'([Hh])omogeniz(.?)',r'\1omogenis\2'),
        (r'([Hh])ospitaliz(.?)',r'\1ospitalis\2'),
        (r'([Hh])umor(.?)',r'\1umour\2'),
        (r'([Hh])ybridiz(.?)',r'\1ybridis\2'),
        (r'([Hh])ypnotiz(.?)',r'\1ypnotis\2'),
        (r'([Hh])ypothesiz(.?)',r'\1ypothesis\2'),
        #IIII#
        (r'([Ii])dealiz(.?)',r'\1dealis\2'),
        (r'([Ii])doliz(.?)',r'\1dolis\2'),
        (r'(.?)([Mm])obiliz(.?)',r'\1\2obilis\3'),
        (r'([Ii])mmortaliz(.?)',r'\1mmortalis\2'),
        (r'([Ii])mmuniz(.?)',r'\1mmunis\2'),
        (r'(.?)([Pp])aneled',r'\1\2anelled'),
        (r'(.?)([Pp])aneling',r'\1\2anelling'),
        (r'([Ii])mperiled',r'\1mperilled'),
        (r'([Ii])mperiling',r'\1mperilling'),
        (r'([Ii])ndividualiz(.?)',r'\1ndividualis\2'),
        (r'([Ii])ndustrializ(.?)',r'\1ndustrialis\2'),
        (r'([Ii])nstill( +)',r'\1nstil\2'),
        (r'([Ii])nitialed',r'\1nitialled'),
        (r'([Ii])nitialing',r'\1nitialling'),
        (r'([Ii])nstallment(.?)',r'\1nstalment\2'),
        (r'([Ii])nstitutionaliz(.?)',r'\1nstitutionalis\2'),
        (r'([Ii])ntellectualiz(.?)',r'\1ntellectualis\2'),
        (r'(.?)([Nn])ationaliz(.?)',r'\1\2ationalis\3'),
        (r'([Ii])nternaliz(.?)',r'\1nternalis\2'),
        (r'([Ii])oniz(.?)',r'\1onis\2'),
        (r'([Ii])taliciz(.?)',r'\1talicis\2'),
        (r'([Ii])temiz(.?)',r'\1temis\2'),
        #JJJJ
        (r'([Jj])eopardiz(.?)',r'\1eopardis\2'),
        (r'([Jj])eweler(.?)',r'\1eweller\2'),
        #KKKK#
        #None known#
        #LLLL#
        (r'([Ll])abeled',r'\1abelled'),
        (r'([Ll])abeling',r'\1abelling'),
        (r'([Ll])ackluster',r'\1acklustre'),
        (r'(.?)([Ll])egaliz(.?)',r'\1\2egalis\3'),
        (r'(.?)([Ll])egitimiz(.?)',r'\1\2egitimis\3'),
        (r'([Ll])eukemia',r'\1eukaemia'),
        (r'(.?)([Ll])evele(.?)',r'\1\2evelle\3'),
        (r'(.?)([Ll])eveling',r'\1\2evelling'),
        (r'([Ll])ibeled',r'\1ibelled'),
        (r'([Ll])ibelous',r'\1ibellous'),
        (r'([Ll])ibeling',r'\1ibelling'),
        (r'([Ll])iberaliz(.?)',r'\1iberalis\2'),
        (r'([Ll])ioniz(.?)',r'\1ionis\2'),
        (r'([Ll])iquidiz(.?)',r'\1iquidis\2'),
        (r'([Ll])ocaliz(.?)',r'\1ocalis\2'),
        (r'([Ll])ouver(.?)',r'\1ouvre\2'),
        (r'( +)([Ll])uster',r'\1\2ustre'),
        #MMMM#
        (r'(.?)([Mm])agnetiz(.?)',r'\1\2agnetis\3'),
        (r'(.?)([Mm])aneuver(.?)',r'\1\2anoeuvre\3'),
        (r'([Mm])arginaliz(.?)',r'\1arginalis\2'),
        (r'([Mm])arshaled',r'\1arshalled'),
        (r'([Mm])arshaling',r'\1arshalling'),
        (r'([Mm])arveled',r'\1arvelled'),
        (r'([Mm])arveling',r'\1arvelling'),
        (r'([Mm])arvelo(.?)',r'\1arvello\2'),
        (r'(.?)([Mm])aterializ(.?)',r'\1\2aterialis\3'),
        (r'([Mm])aximiz(.?)',r'\1aximis\2'),
        (r'([Mm])eager',r'\1eagre'),
        (r'([Mm])echaniz(.?)',r'\1echanis\2'),
        (r'([Mm])emorializ(.?)',r'\1emorialis\2'),
        (r'([Mm])emoriz(.?)',r'\1emoris\2'),
        (r'([Mm])esmeriz(.?)',r'\1esmeris\2'),
        (r'([Mm])etaboliz(.?)',r'\1etabolis\2'),
        (r'([Mm])iniaturiz(.?)',r'\1iniaturis\2'),
        (r'([Mm])inimiz(.?)',r'\1inimis\2'),
        (r'([Mm])iter(.?)',r'\1itre\2'),
        (r'(.?)([Mm])odele(.?)',r'\1\2odelle\3'),
        (r'(.?)([Mm])odeling',r'\1\2odelling'),
        (r'([Mm])oderniz(.?)',r'\1odernis\2'),
        (r'([Mm])oisturiz(.?)',r'\1oisturis\2'),
        (r'([Mm])onolog( +)',r'\1onologue\2'),
        (r'([Mm])onologs',r'\1onologues'),
        (r'([Mm])onopoliz(.?)',r'\1onopolis\2'),
        (r'(.?)([Mm])old(.?)',r'\1\2ould\3'),
        (r'([Mm])olted',r'\1oulted'),
        (r'([Mm])olting',r'\1oulting'),
        (r'([Mm])olt( +)',r'\1oult\2'),
        (r'([Mm])ustache(.?)',r'\1oustache\2'),
        #NNNN#
        (r'([Nn])aturaliz(.?)',r'\1aturalis\2'),
        (r'([Nn])eighbor(.?)',r'\1eighbour\2'),
        (r'([Nn])aturaliz(.?)',r'\1aturalis\2'),
        (r'([Nn])eutraliz(.?)',r'\1eutralis\2'),
        (r'([Nn])ormaliz(.?)',r'\1ormalis\2'),
        #OOOO#
        (r'([Oo])dor( +)',r'\1dour\2'),
        (r'([Oo])dors',r'\1dours'),
        (r'( +)esophagus(.?)',r'\1oesophagus\2'),
        (r'( +)Esophagus(.?)',r'\1Oesophagus\2'),
        (r'( +)estrogen',r'\1oestrogen'),
        (r'( +)Estrogen',r'\1Oestrogen'),
        (r'([Oo])ffense(.?)',r'\1ffence\2'),
        (r'([Oo])melet( +)',r'\1melette\2'),
        (r'([Oo])melets',r'\1melettes'),
        (r'(.?)([Oo])ptimiz(.?)',r'\1\2ptimis\3'),
        (r'(.?)([Oo])rganiz(.?)',r'\1\2rganis\3'),
        (r'([Oo])rthopedic(.?)',r'\1rthopaedic\2'),
        (r'([Oo])straciz(.?)',r'\1stracis\2'),
        (r'([Oo])xidiz(.?)',r'\1xidis\2'),
        #PPPP#
        (r'([Pp])ederast(.?)',r'\1aederast\2'),
        (r'([Pp])ediatric(.?)',r'\1aediatric\2'),
        (r'([Pp])edo( +)',r'\1aedo\2'),
        (r'([Pp])edophil(.?)',r'\1aedophil\2'),
        (r'([Pp])aleo(.?)',r'\1alaeo\2'),
        (r'([Pp])anelist(.?)',r'\1anellist\2'),
        (r'([Pp])aralyz(.?)',r'\1aralys\2'),
        (r'([Pp])arceled',r'\1arcelled'),
        (r'([Pp])arceling',r'\1arcelling'),
        (r'([Pp])arlor(.?)',r'\1arlour\2'),
        (r'([Pp])articulariz(.?)',r'\1articularis\2'),
        (r'([Pp])assiviz(.?)',r'\1assivis\2'),
        (r'([Pp])asteuriz(.?)',r'\1asteuris\2'),
        (r'([Pp])atroniz(.?)',r'\1atronis\2'),
        (r'([Pp])edestrianiz(.?)',r'\1edestrianis\2'),
        (r'([Pp])enaliz(.?)',r'\1enalis\2'),
        (r'([Pp])enciled',r'\1encilled'),
        (r'([Pp])enciling',r'\1encilling'),
        (r'([Pp])harmacopeia(.?)',r'\1harmacopoeia\2'),
        (r'([Pp])hilosophiz(.?)',r'\1hilosophis\2'),
        (r'([Pp])hilter(.?)',r'\1hiltre\2'),
        (r'([Pp])lagiariz(.?)',r'\1lagiaris\2'),
        (r'([Pp])low( +)',r'\1lough\2'),
        (r'([Pp])low(.?)',r'\1lough\2'),
        (r'(.?)([Pp])olariz(.?)',r'\1\2olaris\3'),
        (r'(.?)([Pp])oliticiz(.?)',r'\1\2oliticis\3'),
        (r'([Pp])opulariz(.?)',r'\1opularis\2'),
        (r'([Pp])ouf( +)',r'\1ouffe\2'),
        (r'([Pp])oufs',r'\1ouffes'),
        (r'([Pp])racticed',r'\1ractised'),
        (r'([Pp])racticing',r'\1ractising'),
        (r'([Pp])raesidium(.?)',r'\1residium\2'),
        (r'(.?)([Pp])ressuriz(.?)',r'\1\2ressuris\3'),
        (r'([Pp])retens(.?)',r'\1retenc\2'),
        (r'([Pp])rimaeval',r'\1rimeval'), #Correcting in favour of American spelling#
        (r'(.?)([Pp])rioritiz(.?)',r'\1\2rioritis\3'),
        (r'(.?)([Pp])rivatiz(.?)',r'\1\2rivatis\3'),
        (r'([Pp])rofessionaliz(.?)',r'\1rofessionalis\2'),
        (r'([Pp])rolog( +)',r'\1rologue\2'),
        (r'([Pp])rologs',r'\1rologues'),
        (r'([Pp])ropagandiz(.?)',r'\1ropagandis\2'),
        (r'([Pp])roselytiz(.?)',r'\1roselytis\2'),
        (r'([Pp])ubliciz(.?)',r'\1ublicis\2'),
        (r'([Pp])ulveriz(.?)',r'\1ulveris\2'),
        (r'([Pp])ummeled',r'\1ummelled'),
        (r'([Pp])ummeling',r'\1ummelling'),
        (r'([Pp])ajama(.?)',r'\1yjama\2'),
        #QQQQ#
        (r'([Qq])uarreled',r'\1uarrelled'),
        (r'([Qq])uarreling',r'\1uarrelling'),
        #RRRR#
        (r'([Rr])adicaliz(.?)',r'\1adicalis\2'),
        (r'([Rr])ancor(.?)',r'\1ancour\2'),
        (r'([Rr])andomiz(.?)',r'\1andomis\2'),
        (r'([Rr])ationaliz(.?)',r'\1ationalis\2'),
        (r'(.?)([Rr])aveled',r'\1\2avelled'),
        (r'(.?)([Rr])aveling',r'\1\2avelling'),
        (r'(.?)([Rr])ealiz(.?)',r'\1\2ealis\3'),
        (r'(.?)([Rr])ecogniz(.?)',r'\1\2ecognis\3'),
        (r'([Rr])econnoiter(.?)',r'\1econnoitre\2'),
        (r'([Rr])efueled',r'\1efuelled'),
        (r'([Rr])efueling',r'\1efuelling'),
        (r'(.?)([Rr])egulariz(.?)',r'\1\2egularis\3'),
        (r'([Rr])evele(.?)',r'\1evelle\2'),
        (r'([Rr])eveling',r'\1evelling'),
        (r'(.?)([Vv])italiz(.?)',r'\1\2italis\3'),
        (r'([Rr])evolutioniz(.?)',r'\1evolutionis\2'),
        (r'([Rr])hapsodiz(.?)',r'\1hapsodis\2'),
        (r'( +)([Rr])igor( +)',r'\1\2igour\3'),
        (r'([Rr])itualiz(.?)',r'\1itualis\2'),
        (r'(.?)([Rr])ivaled',r'\1\2ivalled'),
        (r'([Rr])ivaling',r'\1ivalling'),
        (r'([Rr])omanticiz(.?)',r'\1omanticis\2'),
        (r'([Rr])umor(.?)',r'\1umour\2'),
        #SSSS#
        (r'([Ss])aber(.?)',r'\1abre\2'),
        (r'([Ss])altpeter',r'\1altpetre'),
        (r'(.?)([Ss])anitiz(.?)',r'\1\2anitis\3'),
        (r'([Ss])atiriz(.?)',r'\1atiris\2'),
        (r'([Ss])avior(.?)',r'\1aviour\2'),
        (r'(.?)savor(.?)',r'\1savour\2'), #recognizes only lower-case s, because of Gerald Savory
        (r'([Ss])candaliz(.?)',r'\1candalis\2'),
        (r'([Ss])keptic(.?)',r'\1ceptic\2'),
        (r'([Ss])cepter(.?)',r'\1ceptre\2'),
        (r'([Ss])crutiniz(.?)',r'\1crutinis\2'),
        (r'([Ss])eculariz(.?)',r'\1ecularis\2'),
        (r'([Ss])ensationaliz(.?)',r'\1ensationalis\2'),
        (r'([Ss])entimentaliz(.?)',r'\1entimentalis\2'),
        (r'([Ss])epulcher(.?)',r'\1epulchre\2'),
        (r'([Ss])erializ(.?)',r'\1erialis\2'),
        (r'([Ss])ermoniz(.?)',r'\1ermonis\2'),
        (r'([Ss])hoveled',r'\1hovelled'),
        (r'([Ss])hoveling',r'\1hovelling'),
        (r'([Ss])hriveled',r'\1hrivelled'),
        (r'([Ss])hriveling',r'\1hrivelling'),
        (r'([Ss])ignaliz(.?)',r'\1ignalis\2'),
        (r'([Ss])ignaled',r'\1ignalled'),
        (r'([Ss])ignaling',r'\1ignalling'),
        (r'([Ss])molder(.?)',r'\1moulder\2'),
        (r'([Ss])niveled',r'\1nivelled'),
        (r'([Ss])niveling',r'\1nivelling'),
        (r'([Ss])norkeled',r'\1norkelled'),
        (r'([Ss])norkeling',r'\1norkelling'),
        (r'(.?)([Ss])ocializ(.?)',r'\1\2ocialis\3'),
        (r'([Ss])odomiz(.?)',r'\1odomis\2'),
        (r'(.?)([Ss])olemniz(.?)',r'\1\2olemnis\3'),
        (r'([Ss])omber',r'\1ombre'),
        (r'([Ss])pecializ(.?)',r'\1pecialis\2'),
        (r'( +)([Ss])pecter(.?)',r'\1\2pectre\3'),
        (r'([Ss])piraled',r'\1piralled'),
        (r'([Ss])piraling',r'\1piralling'),
        (r'([Ss])plendor(.?)',r'\1plendour\2'),
        (r'([Ss])quirreled',r'\1quirrelled'),
        (r'([Ss])quirreling',r'\1quirrelling'),
        (r'(.?)([Ss])tabiliz(.?)',r'\1\2tabilis\3'),
        (r'(.?)([Ss])tandardiz(.?)',r'\1\2tandardis\3'),
        (r'([Ss])tenciled',r'\1tencilled'),
        (r'([Ss])tenciling',r'\1tencilling'),
        (r'(.?)([Ss])teriliz(.?)',r'\1\2terilis\3'),
        (r'(.?)([Ss])tigmatiz(.?)',r'\1\2tigmatis\3'),
        (r'(.?)([Ss])ubsidiz(.?)',r'\1\2ubsidis\3'),
        (r'([Ss])uccor(.?)',r'\1uccour\2'),
        (r'([Ss])ulfa(.?)',r'\1ulpha\2'),
        (r'([Ss])ulfi(.?)',r'\1ulphi\2'),
        (r'([Ss])ulfu(.?)',r'\1ulphu\2'),
        (r'([Ss])ummariz(.?)',r'\1ummaris\2'),
        (r'([Ss])wiveled',r'\1wivelled'),
        (r'([Ss])wiveling',r'\1wivelling'),
        (r'([Ss])ymboliz(.?)',r'\1ymbolis\2'),
        (r'([Ss])ympathiz(.?)',r'\1ympathis\2'),
        (r'(.?)([Ss])ynchroniz(.?)',r'\1\2ynchronis\3'),
        (r'(.?)([Ss])ynthesiz(.?)',r'\1\2ynthesis\3'),
        (r'(.?)([Ss])ystematiz(.?)',r'\1\2ystematis\3'),
        #TTTT#
        (r'([Tt])antaliz(.?)',r'\1antalis\2'),
        (r'([Tt])asseled',r'\1asselled'),
        (r'([Tt])enderiz(.?)',r'\1enderis\2'),
        (r'([Tt])erroriz(.?)',r'\1erroris\2'),
        (r'([Tt])heoriz(.?)',r'\1heoris\2'),
        (r'([Tt])oweled',r'\1owelled'),
        (r'([Tt])oweling',r'\1owelling'),
        (r'([Tt])oxemia',r'\1oxaemia'),
        (r'([Tt])ranquiliz(.?)',r'\1ranquillis\2'),
        (r'([Tt])ranquilis(.?)',r'\1ranquillis\2'),
        (r'([Tt])ranquilliz(.?)',r'\1ranquillis\2'), #correcting archaic BrEng form to modern BrEng#
        (r'([Tt])ranquillity ([Bb])ase',r'Tranquility Base'), #correcting to IAU standard#
        (r'([Tt])ransistoriz(.?)',r'\1ransistoris\2'),
        (r'([Tt])raumatiz(.?)',r'\1raumatis\2'),
        (r'([Tt])ravelers',r'\1ravellers'), #other forms under "ravelled" above#
        (r'([Tt])ravelog( +)',r'\1ravelogue\2'),
        (r'([Tt])ravelogs',r'\1ravelogues'),
        (r'([Tt])rivializ(.?)',r'\1rivialis\2'),
        (r'([Tt])umor(.?)',r'\1umour\2'),
        (r'([Tt])unneled',r'\1unnelled'),
        (r'([Tt])unneling',r'\1unnelling'),
        (r'([Tt])yranniz(.?)',r'\1yrannis\2'),
        #UUUU#
        (r'([Uu])nioniz(.?)',r'\1nionis\2'),
        (r'([Uu])ntrammeled',r'\1ntrammelled'),
        (r'(.?)([Uu])rbaniz(.?)',r'\1\2rbanis\3'),
        (r'(.?)([Uu])tiliz(.?)',r'\1\2tilis\3'),
        #VVVV#
        (r'([Vv])alor',r'\1alour'),
        (r'([Vv])andaliz(.?)',r'\1andalis\2'),
        (r'(.?)([Vv])aporiz(.?)',r'\1\2aporis\3'),
        (r'([Vv])apor( +)',r'\1apour\2'),
        (r'([Vv])apors',r'\1apours'),
        (r'([Vv])aporiz(.?)',r'\1aporis\2'), #Weirdly, words that have vapour as a root lose the cosmetic 'u' #
        (r'(.?)erbaliz(.?)',r'\1erbalis\2'),
        (r'([Vv])ictimiz(.?)',r'\1ictimis\2'),
        (r'([Vv])igor( +)',r'\1igour\2'),
        (r'([Vv])isualiz(.?)',r'\1isualis\2'),
        (r'([Vv])ocaliz(.?)',r'\1ocalis\2'),
        (r'([Vv])ulcaniz(.?)',r'\1ulcanis\2'),
        (r'([Vv])ulgariz(.?)',r'\1ulgaris\2'),
        #WWWW#
        (r'([Ww])easeled',r'\1easelled'),
        (r'([Ww])easeling',r'\1easelling'),
        (r'([Ww])esterniz(.?)',r'\1esternis\2'),
        (r'([Ww])omaniz(.?)',r'\1omanis\2'),
        (r'([Ww])oolen(.?)',r'\1oollen\2'),
        (r'([Ww])oolies',r'\1oollies'),
        (r'([Ww])ooly',r'\1oolly'),
        #XXXX#
        #None known#
        #YYYY#
        (r'([Yy])odeled',r'\1odelled'),
        (r'([Yy])odeling',r'\1odelling'),
        #ZZZZ#
        #None known#
        ],
    'exceptions': {
        'inside-tags': [
            'pre',
            'code',
            'nowiki',
            'hyperlink',
            'link',
            'comment',
            'center',
            'color',
            'captiontextcolor',
            'gallery'
            ],
        'category': [
            'spelling',
            ],
        'inside': [
            'Similarities in Proto-Cultural Artifacts',
            'Honor_Blackman',
            'Savory',
            'tachometer',
            'mileometer',
            'spectrometer',
            'diameter',
            'diameters',
            'pentameter',
            'pentameters',
            'chronometer',
            'chronometers',
            'geometer',
            'geometers',
            'Geometer',
            'rateometer',
            'rateometers',
            'Rateometer',
            'Rateometers',
            'Hydrokinometer',
            'hydrokinometer',
            'interferometer',
            'EMF meter',
            'parameter',
            'altimeter',
            'altimeters',
            'year meter',
            'yearometer',
            'Limiters',
            'Limitres',
            'parameters',
            'Graystark',
            'perimeter',
            'pretension',
            '{{color',
            '(color)',
            'Dougray',
            'arboretum',
            'Arboretum',
            '{{ColorIntLink',
            'Broadway Dance Center',
            'Good Neighbors',
            'stingray',
            'stingrays',
            'Valoran',
            'Pearl Harbor',
            'anagram',
            'Anagram',
            'anagrams',
            'Anagrams',
            'Previsualization}}',
            'behemoth',
            'Behemoth',
            'behemoths',
            'Yourfavoritemartian',
            'Music-a-grams',
            'anagrams',
            'Anagrams',
            'hologram',
            'Hologram',
            'Holograms',
            'holograms',
            'electrocardiogram',
            'electrocardiograms',
            'pentagram',
            'pentagrams',
            'telegram',
            'telegrams',
            'transgram',
            'transgrams',
            'Transgram',
            'Transgrams',
            'diagram',
            'diagrams',
            'Diagram',
            'Diagrams',
            'engram',
            'engrams',
            'Grigory',
            'Unauthorized Guide',
            'Honor Blackman', #this isn't being excepted and I don't know why
            'Medal of Honor',
            'Arborge Quince',
            'program', #need a forum discussion here
            'programs',
            'reprogram',
            'deprogram',
            'pictogram',
            'pictograms',
            'Pictogram',
            'Pictograms',
            'phonogram',
            'phonograms',
            'background-color',
            'color:',
            'color :',
            'border-color',
            'text-align: center;',
            'text-align:center;',
            'align=center',
            'align = center',
            'align= center',
            'align =center',
            'position=center',
            '<center>',
            '</center>',
            '</ center>',
            'Encyclopedia of Fantastic',
            'themonster',
            'arboreal',
            'Moldova', #Doesn't make an exception and I don't know why
                   ' Moldova', #this DOES work.  very weird.
            'Fun at the Funeral Parlor',
            'humorous',
            'Humorous',
            'limiter',
            'appalling',
            'appalled',
            'Splendorosa',
            'Demeter',
            'cemetery',
            'Cemetery',
            'Gerald Savory',
            'Savory',
            'Johnson Space Center',
            'Kennedy Space Center',
            'Center',
            '[[Catalog]]',
            'Chilitern',
            'chemotherapy',
            'Chemothreapy',
            'Colorado',
            'previsualization',
            'Scarborough',
            'Akoshemon',
            'Plowman',
            'torpedo',
            'torpedos',
            'Torpedo',
            'Torpedos',
            'stingray',
            'stingrays',
            'Stingray',
            'Stingrays',
            'Beccy Armory',
            'Honore', #None of these attempts to 
            'Honoré', #except Honoré Lechasseur works
            'Honoré', #The bot has been made to not correct
            'Honore', #for capital-H Honor
            'lightsaber',
            'Polygram',
            'Majestic Theater',
            'Taplow',
            'Fyodor',
            'Target Practice',
            'target practice',
            'Synthesizing_Starfields', #doesn't appear to work
                   ' Synthesizing Starfields', #this does work
            'Pearl_Harbor',
            'Mercury Theater',
            'Event Synthesizer',
            'Emergency Program One',
            'bgcolor',
            'shield-o-gram',
            'dayofthemoon',
            'blasphemous', #triggered as blasphaemous because the unanchored ([Hh])emo rule matches the "hemo" inside it
            'grams operator', #not sure this is a real word, but it appears on DMP
            ],
        }
    }


Explanation

Um, it just works. It's a sort of monstrous amount of code to explain in detail, but the basic philosophy is to look for the root words that have trans-Atlantic spelling differences and to add regex that will allow for all permutations of that word. Take for instance these two:

        (r'(.?)([Pp])olariz(.?)',r'\1\2olaris\3'),
        (r'(.?)([Pp])oliticiz(.?)',r'\1\2oliticis\3'),

The notion here is that politicize would not likely be at the start of a sentence, so it's hard to imagine when you'd need to account for a capital P. But politicizing — because it's a noun as well as a verb — might very well start a sentence. And, it's possible, in the science fictional world of Doctor Who, for something to get "depolarized", or to have a "depolarising effect".

So the roots here are olariz and oliticiz, because we're trying to change -iz into -is.

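We include ([Pp]) in our search in order to find either polarize or Polarize. We also capture the character after the -iz with (.?), which covers -ize, -izing and -ization alike: the rest of the word lies outside the match, so it is left untouched. The question mark makes that character optional, so the pattern still matches when nothing follows the -iz; it does not make the search non-greedy, because .? only ever matches zero or one character. And we add (.?) at the beginning to pick up the single character before the p, whether that is the space on the left side of the word or the last letter of a prefix such as de-.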

That gives us all we need to put together the British spelling out of:

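  • Group 1 - (.?), the character before the p (a space, or the end of a prefix), if any
  • Group 2 - the lowercase or uppercase p
  • the root, olaris or oliticis
  • Group 3 - (.?), the character after the root (the start of the suffix), if any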
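
The grouping described above can be tried outside the bot with Python's standard re module. What follows is only a minimal sketch: the two rules are copied from the fix, but the names RULES, apply_rules and show_groups are made up for this illustration, and a plain re.sub pass is an assumption about how the pairs get applied, not a description of replace.py itself.

```python
import re

# The two rules quoted above, copied verbatim from the fix.
RULES = [
    (r'(.?)([Pp])olariz(.?)', r'\1\2olaris\3'),
    (r'(.?)([Pp])oliticiz(.?)', r'\1\2oliticis\3'),
]


def apply_rules(text, rules=RULES):
    """Apply each (pattern, replacement) pair in order with re.sub.

    Hypothetical helper: a stand-in for the bot, for experiments only.
    """
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


def show_groups(pattern, word):
    """Return the captured groups of the first match, or None if no match."""
    match = re.search(pattern, word)
    return match.groups() if match else None


if __name__ == '__main__':
    print(apply_rules('It had a depolarizing effect.'))
    print(show_groups(RULES[0][0], 'depolarizing'))
```

Calling show_groups(RULES[0][0], 'depolarizing') returns ('e', 'p', 'i'): the last letter of the prefix, the p, and the first letter of the suffix. Everything else in the word sits outside the match and is carried over unchanged.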

Known limitations

Spelling is not an exact science. This will never be a user-fix that you can run automatically, simply because American spellings, like license and tire, are sometimes correct in British English. So you have to run this script manually in order to understand the intended context. But the more you run it on your site, the faster it'll go for you.

Some people will doubtless quibble about the spellings that have been chosen as "correct". There is considerable variability of opinion as to how the majority of Britons spell certain words, such as instal and travelling. In some cases the code simply closes its eyes and chooses.
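
Because every change needs a human decision, it can help to list the candidate changes with a little context before editing anything. The following is a rough sketch only: propose_changes is a hypothetical helper, not part of replace.py, and it checks each rule against the original text separately rather than chaining the rules one after another.

```python
import re


def propose_changes(text, rules, context=20):
    """Yield (before, after, snippet) for every match, without editing text.

    Hypothetical review helper: 'rules' is a list of (pattern, replacement)
    pairs in the same shape as the fix's 'replacements' list.
    """
    for pattern, replacement in rules:
        for match in re.finditer(pattern, text):
            before = match.group(0)
            after = match.expand(replacement)
            if before != after:
                # Show some surrounding text so a person can judge the context.
                start = max(match.start() - context, 0)
                end = min(match.end() + context, len(text))
                yield before, after, text[start:end]


if __name__ == '__main__':
    sample_rules = [(r'([Cc])olor(.?)', r'\1olour\2')]
    for before, after, snippet in propose_changes('The color of the TARDIS', sample_rules):
        print(repr(before), '->', repr(after), 'in', repr(snippet))
```

A reviewer can skim the snippets and decide which proposals are genuine Americanisms and which are proper names or context-dependent spellings.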

May not be out-of-the-box ready for your wiki

Because this fix was made for, and road-tested at, a site devoted to Doctor Who, its exceptions are clearly biased towards words that occur in literature surrounding that programme. This tends to make the exceptions quite fanciful: in many cases, the exceptions aren't even common nouns but proper names, often of things that don't actually exist.

If you're running a wiki about a completely different subject, the included exceptions list probably won't be helpful to you. It's unlikely your wiki mentions "Gerald Savory", "Beccy Armory" and someone named "Honoré".

You'll need to run the fix a few times to discover what your own exceptions are.
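
One way to picture what an 'inside' exception does is to treat each exception as a protected stretch of text and to skip any match that touches it. The sketch below is a simplified model for trying out candidate exceptions, assuming the exceptions are literal strings. The helper names are hypothetical, and this is not a description of how replace.py actually implements exceptions.

```python
import re


def protected_spans(text, exceptions):
    """Return (start, end) spans covered by any literal exception string."""
    spans = []
    for phrase in exceptions:
        for match in re.finditer(re.escape(phrase), text):
            spans.append(match.span())
    return spans


def apply_with_exceptions(text, rules, exceptions):
    """Apply each rule, leaving alone any match that overlaps an exception.

    Simplified, hypothetical model: the real bot may behave differently.
    """
    for pattern, replacement in rules:
        # Recompute for every rule, since earlier rules can change offsets.
        spans = protected_spans(text, exceptions)

        def substitute(match, spans=spans, replacement=replacement):
            for start, end in spans:
                if match.start() < end and start < match.end():
                    return match.group(0)  # overlaps an exception: keep as-is
            return match.expand(replacement)

        text = re.sub(pattern, substitute, text)
    return text


if __name__ == '__main__':
    meter_rule = [(r'(.?)([Mm])eter(.?)', r'\1\2etre\3')]
    sample = 'The parameter was one meter wide.'
    print(apply_with_exceptions(sample, meter_rule, []))
    print(apply_with_exceptions(sample, meter_rule, ['parameter']))
```

Running a rule once with an empty exception list and once with your candidates makes it easy to see which words, such as parameter under the meter rule, need protecting on your own wiki.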
