PHP Function: Latin-em
There are many instances where a word or phrase, or an array of such, requires special treatment. Case in point: Latin names, specifically, for this example, insect names. In the scientific community insects are not regularly called “bugs.” They have a taxonomic binomial (two names) by which they are referred to. This is a worldwide practice. It ensures that entomologists the world over stay on the same page. It doesn’t always work, as I can think of several deviations from this rule, but it is the rule.
The Treatment
So, let’s use an insect that we can all relate to and give it the “treatment.” Let’s use the convergent ladybird beetle or ladybug as it’s commonly called. The binomial for the insect is Hippodamia convergens. The first name, Hippodamia, is the genus, and the second name, convergens, is the species. And the treatment? Well it goes like this…
Latin Binomial Treatment
- The first occurrence in a section or chapter requires that the full binomial be written. (Both names.)
- Occurrences thereafter must bear the first initial of the genus and the full species name.
- The binomial must be italicized because it is Latin.
- The genus is capitalized, the species is not.
There are other rules, too, regarding crediting the binomial to its namer. “Linnaeus” is an example. His name would follow the binomial, not italicized. The first occurrence would bear the full name, and subsequent occurrences would bear the initial. In this case it’s “L.” There are other rules as well, but the critical ones are really the ones in the list above.
All of these rules are par for the course for your average science writer, and they are all very easy to deal with in desktop publishing programs such as PageMaker, Quark, InDesign, and (perhaps) Word. There are two things to do: write it right and italicize it, and the latter is really simple to do for on-paper production. But on the web the rules change.
If we could only deal with plain text all would be right as rain. We can. The binomial use, the capped genus initial, all so easy. Italicizing though, that’s harder. Now, first I must say I hand code only. Perhaps one of the popular WYSIWYG web-building programs out there has a function to deal with italicizing to the extent I’m thinking of. I really don’t know. But I do know the hand-coder is pretty much cut-n-paste restricted. But that’s okay, nobody really wants to clutter up that nice clean text with all that HTML anyway. Right? Nah, let the server do it. Specifically, let’s use PHP to get the job done.
Beginning the process
Let’s define the job. First we must use the <i></i> element. One may think, just use CSS and a span class to get the job done. But you can’t, for two reasons. The first is that the binomial still needs to be italicized when the CSS isn’t available. The second is that the language declaration for each binomial should be present. In this case, being Latin, we’d use lang="la".
Using the beetle example I provided above, the HTML to write would be as follows:
<i lang="la">Hippodamia convergens</i>
And the resulting output to the page:
Hippodamia convergens
Simple. What’s the problem? Lazy? Not lazy. Do it a thousand times on one site, and you’ll better appreciate what a pain it can be. There’ll be other words to use in lieu of “simple,” in fact, but they can’t be posted here.
Okay, now let’s make PHP process good old plain text (with a little mark-up, keep reading) and put in the HTML for us at the server level immediately before it delivers the page requested by a site visitor. Wow, that sounds cool. Science writers — the webby ones, anyway — would really like this I bet. I know I do. At the time of this writing I am building a huge website with an extensive list of Latin binomials used repeatedly, all needing the Latin’em function. So much so that the script described herein will become hot to the touch from so much use.
So, without further ado, let’s ado some PHP.
Let’s make an array of the Latin names which will appear in our content. For this example I will use the beetle name given above and I will add two more. First a parasitic nematode, Heterorhabditis bacteriophora, and then an aphid-parasitic mini-wasp called Aphidius matricariae. So what do we have for our array? Remembering the rule of section or chapter occurrences and the genus capitalization, we’ll have six names in our array:
- Hippodamia convergens
- H. convergens
- Heterorhabditis bacteriophora
- H. bacteriophora
- Aphidius matricariae
- A. matricariae
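If typing both forms of every name gets old, the list itself could be built from just the full binomials. What follows is only a sketch of that idea, not part of my script: the function names abbreviate_binomial() and build_latin_list() are made up for this illustration, and it assumes every entry is a plain "Genus species" pair.

```php
<?php
// Hypothetical helper: turn "Genus species" into "G. species".
function abbreviate_binomial(string $binomial): string
{
    // Split into the genus and everything after it (the species name).
    list($genus, $species) = explode(' ', trim($binomial), 2);
    return strtoupper(substr($genus, 0, 1)) . '. ' . $species;
}

// Hypothetical helper: list each full binomial followed by its abbreviated form.
function build_latin_list(array $binomials): array
{
    $list = array();
    foreach ($binomials as $binomial) {
        $list[] = $binomial;
        $list[] = abbreviate_binomial($binomial);
    }
    return $list;
}

$latins = build_latin_list(array(
    'Hippodamia convergens',
    'Heterorhabditis bacteriophora',
    'Aphidius matricariae',
));
print_r($latins);
```

For the three names above this produces the same six entries as the hand-written list.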
Okay, so here’s the start of the function’s process and the array of constant values:
<?php
// Name the function and kick it off
function latinem($latin) {
    // Read the file and bring its lines together as one string, rockin' and ready
    $latin = implode("", file($latin));
    // The list needs to be defined. Hmm, Latins, there must be more than one
    $latins = array(
        // Our predetermined binomial values list begins
        'Hippodamia convergens',
        'H. convergens',
        'Heterorhabditis bacteriophora',
        'H. bacteriophora',
        'Aphidius matricariae',
        'A. matricariae',
        // Our predetermined binomial values list ends
    );
    // foreach handles the whole list; each value in turn becomes the "word"
    foreach ($latins as $word) {
        // Every occurrence of the "word" in the string is replaced with a tagged copy
        $latin = str_replace($word, '<i lang="la">' . $word . '</i>', $latin);
    }
    // This is the output. We just need some text to process through the function to see if there are matches.
    return $latin;
}
?>
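For anyone who would rather feed the function a string than a file name, here is a rough variant. It is a sketch under my own assumptions, not the script described in this article: latinem_text() is a made-up name, the list is passed in as a parameter, and it swaps the str_replace loop for a single preg_replace_callback pass so each name gets wrapped once and only whole-word matches count.

```php
<?php
// Hypothetical variant of latinem() that takes the text itself, not a file name.
function latinem_text(string $text, array $latins): string
{
    // Nothing to look for, nothing to do.
    if (count($latins) === 0) {
        return $text;
    }

    // Longest names first, so the longer name is tried first where two could match.
    usort($latins, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // Escape each name for use in a regular expression and join them with "|".
    $alternatives = array_map(function ($name) {
        return preg_quote($name, '/');
    }, $latins);
    $pattern = '/\b(?:' . implode('|', $alternatives) . ')\b/';

    // One pass over the text: every match is wrapped exactly once.
    return preg_replace_callback($pattern, function ($match) {
        return '<i lang="la">' . $match[0] . '</i>';
    }, $text);
}

$latins = array('Hippodamia convergens', 'H. convergens');
echo latinem_text('<p>Hippodamia convergens eats aphids. H. convergens is common.</p>', $latins);
```

Because the text is only scanned once, a name can never be tagged a second time inside a tag that was just added.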
Okay, now that this is done, we need to save the script above in a file called, um, let’s say “latinem.php.” Now, the rest of the work is done on the actual web page. First we need to include the script. I placed mine between the </head> and <body> tags using an include statement.
<?php include("latinem.php"); ?>
Easy enough, right? One step left: we have to put the function call on the page. This will serve as an outlet of sorts, allowing the “treated” text to flow onto the page like water out of a hose. Now do bear in mind the content is echoed onto the page. The actual text will reside in a plain text flat file, though it could be drawn from a database such as MySQL. It would be necessary to write some basic mark-up in the text file and this is fine. This is how it might be done if the file was CMS-produced. The basic mark-up? That’d be <p></p> predominantly, but there are really no limits aside from one (the caveat): you will not be able to further process PHP within that text, because the way this is done in this example it has already been processed and output directly to the page. You can, however, run the text through multiple functions if the implementation is the same (I’ll show an example at the end). I’m sure the script could be modified to, say, pull includes into the text (though you’ll want to make that text file a PHP file if you do).
A very basic example of a web page follows. In it I will show where you might place the function call.
<html>
<head>
<title>Your Page Title</title>
<!-- Meta data and your links, etc. -->
</head>
<?php include("latinem.php"); ?>
<body>
<h1>Your Top Level Heading</h1>
<h2>Your Second Level Heading</h2>
<h3>Your Third Level Heading</h3>
<!-- Okay, echo time. The text is marked up within paragraphs already -->
<?php echo latinem("yourfile.txt"); ?>
</body>
</html>
That’s it. Save the file, and the binomials that we predefined, provided they’re present in the echoed file (yourfile.txt), should be “treated.”
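One thing the page above takes for granted is that the text file is actually there. Here is a hedged sketch of a variant that checks first. The name latinem_checked(), the fallback wording, and passing the list in as a parameter are all my assumptions for this example, and a temporary file stands in for yourfile.txt.

```php
<?php
// Hypothetical variant of latinem() that checks the file before reading it.
function latinem_checked(string $path, array $latins): string
{
    if (!is_file($path) || !is_readable($path)) {
        // Placeholder wording; use whatever suits the site.
        return '<p>This content is currently unavailable.</p>';
    }
    $text = file_get_contents($path);
    foreach ($latins as $word) {
        $text = str_replace($word, '<i lang="la">' . $word . '</i>', $text);
    }
    return $text;
}

// Demonstration with a temporary file standing in for yourfile.txt.
$path = tempnam(sys_get_temp_dir(), 'latin');
file_put_contents($path, "<p>Aphidius matricariae parasitizes aphids.</p>\n");
$latins = array('Aphidius matricariae', 'A. matricariae');

echo latinem_checked($path, $latins); // the treated paragraph
unlink($path);
echo latinem_checked($path, $latins); // the file is gone, so the fallback paragraph
```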
Taking it further
Obviously, this script could be used to treat other “words” a different way. You could even echo links from plain text by simply putting the correct HTML in the script. So, you could make the script take every occurrence of certain difficult words and turn them into links to your site glossary, for example. It should be self-evident how to modify this script, so I’ll not expound further.
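For what it’s worth, a glossary version might look something like the sketch below. Everything specific in it is invented for illustration: the function name linker_text(), the glossary terms, and the /glossary.php URLs. It works on a string rather than a file, and it uses strtr() with an array, which makes all the swaps in one pass and never re-scans text it has just inserted.

```php
<?php
// Hypothetical "linker": wraps glossary terms in links to a glossary page.
function linker_text(string $text, array $glossary): string
{
    // Build a term => replacement map.
    $pairs = array();
    foreach ($glossary as $term => $url) {
        $pairs[$term] = '<a href="' . htmlspecialchars($url) . '">' . $term . '</a>';
    }
    // An empty map means there is nothing to link.
    if (count($pairs) === 0) {
        return $text;
    }
    // strtr() with an array replaces every term in a single pass.
    return strtr($text, $pairs);
}

// Made-up terms and URLs, purely for the example.
$glossary = array(
    'nematode'   => '/glossary.php#nematode',
    'parasitoid' => '/glossary.php#parasitoid',
);
echo linker_text('<p>A nematode is a roundworm.</p>', $glossary);
```

Note that this matches plain substrings, so a term that also appears inside a longer word would be linked too.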
Let’s say, perhaps, that we want to run the same text file through the new pseudo-function that I briefly described above. We’ll not only tag our Latin binomials, but we’ll also make some glossary page links. We can call the function “linker.” Having done this, let’s link that script by including it near the top of the page as we did for “latinem.php.” Now we have to make sure the text flows onto the page through both functions, so to speak. To make this so we nest the functions:
<?php echo latinem(linker("yourfile.txt")); ?>
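One wrinkle with nesting: latinem() as written above expects a file name and reads the file itself, so it can’t take the already-processed text that linker() hands back unless one of the two is changed to accept a string. A minimal sketch of one way around that follows, reading the file once and passing the text through each function in turn. The helper name treat_file(), the two stand-in filters, and the /glossary.php URL are assumptions made up for this example.

```php
<?php
// Hypothetical helper: read the file once, then pass the text through each filter in order.
function treat_file(string $path, array $filters): string
{
    $text = implode('', file($path));
    foreach ($filters as $filter) {
        $text = $filter($text);
    }
    return $text;
}

// Two tiny stand-in filters that take a string and return a string.
$linker = function (string $text): string {
    return str_replace('aphid', '<a href="/glossary.php#aphid">aphid</a>', $text);
};
$latinem = function (string $text): string {
    return str_replace('A. matricariae', '<i lang="la">A. matricariae</i>', $text);
};

// A temporary file stands in for yourfile.txt.
$path = tempnam(sys_get_temp_dir(), 'latin');
file_put_contents($path, '<p>A. matricariae attacks the aphid.</p>');
echo treat_file($path, array($linker, $latinem));
unlink($path);
```

The order of the filters can matter: whichever runs second also sees the HTML the first one added.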
That’s it. We’re done. Questions, comments, or suggestions on how to better approach this function’s action, please post them here. Forgive my weak explanations of the operators and steps and whatnot. I’m still learning. I just felt like sharing some homegrown… PHP.
Author’s note:
It’s not what one would call an elegant script or anything like that, I don’t think. It is, however, quite useful, and if not used in the way I am using it, I’m sure it could be put to some creative use. Have fun.
I’m fairly new to PHP and needed a little help to get this %$@#!* thing working, so I would like to thank Jonathan Fenocchi for the assistance he offered me. Thanks buddy. I could have done it without your help, but it might have taken me weeks.
Mike
Jonathan Fenocchi responds:
Posted: June 4th, 2005 at 1:41 am →
Hey Mike, nice follow-up. I posted an article on a more stable (and less unpredictable) way to do this, and also in the comments section another method for doing this was pointed out.
Mike Cherim responds:
Posted: June 4th, 2005 at 8:23 am →
Thanks Jonathan,
Man, it didn’t take you long to find this article, lol. It was around 3:30 am when I published it.
Yeah, I did play with ob_start and ob_flush (though that might have been on a security issue on a different project, not sure). I do remember it giving me a lot of grief. I did note that article you wrote before, and the response, and will have to revisit this. We should talk more on this.