Friday, August 2, 2013

Proust graphs

I recently finished reading the 2002 English translations of the seven volumes of Marcel Proust's In Search Of Lost Time.  My pattern was generally to interleave another book or two between the readings of the Proust volumes, to avoid burnout and allow some time for thought.  The overall book is immense, with something like 1.3 million words.  Depending on the printing and the translation, that works out to about 4000 pages.  In addition to its length, I found it not to be a book that I could read much more than twenty or so pages of at one sitting.  The writing is dense and Proust tends to dwell on ideas (about art, emotions, memory, whether his girlfriend is a secret lesbian, and many other things) for lengthy sections.  It would get my mind going.  When my daughter was young and many nights were spent reading to her, I developed the strange ability to read while paying absolutely no attention to what I was reading.  I could read a kid's book aloud, with proper inflections and everything, while thinking of something completely different, leaving with no memory of what I had read.  I didn't read Proust aloud, but I did find myself occasionally reading a paragraph, then drifting into thought about what that all meant, but unfortunately continuing to read the next paragraph without attention.  Then I'd realize what had happened and have to retrace and reread.  Further, Proust rather famously wrote very long sentences, with many dependent clauses. The plot, to the extent that there is one, moves very slowly.  All of that meant that my reading pace was much slower than for most other books I read.

That doesn't look like much of a sales job.  To make it sound even less appealing, most of the characters are not very good, likable people, and the first person narrator is no exception.  There are many romantic and sexual relationships described, and not one of them is healthy and loving.  Proust was himself gay, but the depiction of homosexuality is not one a 21st century tolerant liberal would produce.

Despite all of that, the book is profound and moving.  I won't say life-changing, but it's a book that I'll think about and be influenced by for a long time.

The narrative flow of the many characters and their interrelationships has an intricate structure in the novel.  The only one who is present throughout is the narrator (called Marcel a couple of times but mostly nameless).  Early on, Charles Swann is a main focus.  There's a section in the first volume, Swann's Way, in which the narration departs from the first person for the only time, to relate the story of the romance of Swann and Odette, a relationship whose obsessiveness and jealousy are later mirrored in several of the narrator's own relationships.  Swann then retreats in importance, and dies halfway through the greater novel, though what happens to his memory and reputation is a key theme.  Another main character, Baron de Charlus, is barely present early on, then dominates for a while, then retreats again.  The great romance of the narrator is with Albertine, who does not really show up until the second volume.  Side note: the two main volumes dealing with the Albertine relationship, The Prisoner and The Fugitive, are widely considered the least pleasant to get through.  That was the case for me.  It's such a deranged relationship.  Almost all that Marcel cares about is keeping Albertine away from lesbian affairs, which he does very unsuccessfully.  It ends up being very repetitive.  One wishes he would get on with leaving Albertine, as he threatens to do innumerable times.  These two volumes and the final Time Regained were published posthumously.  It is very likely that Proust would have done some major editing work on them had he lived.  Complicating things further, Proust based Albertine on his chauffeur Alfred Agostinelli, with whom he had a long affair, and who also had relationships with women.  There's a prominent literary theory that other women (Gilberte, Andrée, Lea) in Lost Time are also based on men.

To visualize the comings, goings, and relationships in Lost Time, I turned to programming and graphing.  The entire novel is freely available in plain text format from a variety of sources.  The available translation is the original Moncrieff rather than the recent one I read, but I don't think my methods would show much difference between the two.  The first step was to strip all punctuation out of the text and leave one word on each line.  The next was to identify all the ways in which a character is named in the novel.  For instance, Swann is identified as Swann, M. Swann, Charles, and the possessives of all three.  But the word Swann can also mean his wife or daughter, so Mme Swann, Mlle Swann, Odette Swann, and Gilberte Swann must be taken out of his column.  Further, there are a few other Charleses mentioned (Charles Morel and a few kings), which must also be removed.  Finally, Charles Swann would be double counted (Charles and Swann), and such double counts must be accounted for.  I did this for all the major characters, and ended up with a list of every location (word number) in the novel where each character is named.  The giant exception to this is the narrator himself, who is constantly there except for the Swann In Love section.  I actually did count all locations of I, me, my, mine, etc., but there wasn't any good way of separating the narrator's personal pronouns from instances when other characters used them about themselves.  And in any case, it wouldn't have told me much except that it's a first person novel.  I decided to do a running count of the number of hits on a name per thousand words, as a way of smoothing the results a bit.  Here is what comes out for the character of Swann:

The thick vertical lines are the breaks in the volumes of the novel.  The thinner lines are section breaks within each volume.  So one can see that Swann appears early on in the first section of the first volume.  This is the Combray section, and Swann is invited to dinner with the narrator's parents, which leads to the narrator's mother not coming up to kiss him good night.  He then leaves for a while, and comes back to dominate the Swann In Love section.  In Shadow of Young Girls, he is present in the first section, in which the narrator is attempting to gain admittance into his house so as to pursue a friendship with Odette and a relationship with Gilberte.  Swann is then referred to only occasionally, reappearing dramatically at the end of Guermantes Way, in which he shows up at the house of his friends the Duc and Duchesse de Guermantes to inform them that he is dying.  (They don't really care, being concerned about getting dressed for a party.  This is Proust skewering the vestigial nobility of Third Republic France.)  Swann is later said to have died.  This report comes from an offhand remark by another character, which shows the way in which Swann is already fading in the memory of society, having failed to achieve any immortality, because he threw away his artistic gifts.  Afterward he is only occasionally referred to, mostly through his still living wife and daughter, who abandon the Swann name and legacy, leaving him forgotten.  As Swann is one of the more admirable characters, this is all pretty bleak.
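For anyone who wants to try the counting behind a graph like this, here is a rough Python sketch of the steps described earlier: strip the punctuation, find the word numbers where Swann himself is named, and take a running count per thousand words.  It is not my original program.  The function names, the regular expression, and the `step` parameter are made up for illustration, and the exclusion rules cover only the cases listed above (the kings named Charles would still need to be weeded out separately).

```python
import re
from bisect import bisect_left


def tokenize(text):
    """Strip punctuation and return the words in order.

    Runs of letters are kept; everything else is a separator, so
    "Swann's" becomes "Swann" followed by "s" and "M." becomes "M".
    """
    return re.findall(r"[^\W\d_]+", text)


def swann_mentions(words):
    """Word numbers at which Charles Swann (not his wife or daughter) is named."""
    not_him = {"Mme", "Mlle", "Odette", "Gilberte"}  # a preceding word that means someone else
    hits = []
    for i, word in enumerate(words):
        prev = words[i - 1] if i > 0 else ""
        nxt = words[i + 1] if i + 1 < len(words) else ""
        if word == "Swann" and prev not in not_him:
            hits.append(i)
        elif word == "Charles" and nxt not in {"Swann", "Morel"}:
            # "Charles Swann" is counted once, at "Swann"; Charles Morel is skipped.
            hits.append(i)
    return hits


def running_count(hits, n_words, window=1000, step=100):
    """Hits in the `window` words ending at each sampled position.

    `hits` must be sorted, which it is when it comes from swann_mentions.
    Returns (position, count) pairs ready for plotting.
    """
    return [
        (p, bisect_left(hits, p) - bisect_left(hits, p - window))
        for p in range(window, n_words + 1, step)
    ]
```

Plotting the second element of each pair against the first gives a curve of the kind discussed above; the same three steps would be repeated with a different rule set for each character.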

Next I wanted to see how interrelated some of the characters were, through the simple method of determining how close the mentions of two characters tended to be.  For each pair of characters A and B, for each mention of A the closest mention of B is found (forward or backward), and the distance between the two is counted in words.  The average distance can then be computed.  This can be done in each direction: the average distance to a B mention for each A mention, and vice versa.  The two directions can differ a great deal.  For instance, if A is mentioned consistently throughout the book and B only one time, from B's perspective A is close (the one mention of B is close to one of the many mentions of A), but from A's perspective B is not close (most mentions of A are nowhere near the sole mention of B).  So I made both calculations and decided the overall distance between the two is the average of the two numbers.  I had to do it this way if I wanted to map the distances for more than two characters, since a map needs a single distance for each pair.
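Here is a small Python sketch of that distance calculation, assuming each character's mentions are already available as a sorted list of word numbers.  The function names are invented for this illustration, and both lists must be non-empty.

```python
from bisect import bisect_left


def mean_nearest(a_hits, b_hits):
    """Average, over the mentions of A, of the distance to the closest mention of B."""
    total = 0
    for a in a_hits:
        i = bisect_left(b_hits, a)
        # The closest B mention is either just before or just after position a.
        neighbors = b_hits[max(i - 1, 0): i + 1]
        total += min(abs(a - b) for b in neighbors)
    return total / len(a_hits)


def character_distance(a_hits, b_hits):
    """Overall distance: the average of the two one-directional averages."""
    return (mean_nearest(a_hits, b_hits) + mean_nearest(b_hits, a_hits)) / 2
```

With A mentioned at words 100, 200, 300, and 400 and B only at word 210, B sees A as 10 words away, A sees B as 100 words away on average, and the overall distance is 55.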

To make a 2-D map, I reached for some optimization techniques.  I used a multidimensional scaling method.  Basically this method takes the desired distance between each pair of points, and attempts to place those points on a 2-D surface such that each is the proper distance from the others.  Most times, it is impossible to get them all exactly right.  As a trivial example, if you have three points A, B, and C, and AB is 10, BC is 10, and AC is 30, it's physically impossible to place them that way.  If you get AB and BC right, the maximum mapped AC (putting ABC on a straight line) is 20.  Another trivial example is that you cannot place four points equidistant from one another on a plane.  (You can in 3-D: make a tetrahedron.)  So the method attempts to do the best possible job, moving points around until the tension (a measure of how far the mapped distances are from the desired distances) is minimized.  That will generally mean accepting small errors on each distance.  If tension is below 0.15, that is generally considered good enough.  All the maps I will present here have tensions below that.  The basic method places the points at random on the map, then nudges them a small amount.  If the nudging improves the tension score, it keeps the change and repeats; if not, it nudges them in a different direction.  I added the twist that if the method stalls out with an unacceptable tension score, it blows everything up, places the points in a different random location, and starts again.  In evolutionary terms, this allows saltation: a solution can leap from a dead-end local best to a higher peak farther away.  Eventually it settles on a best solution, though the actual map can vary each time I run it, as the initial random guess will be different.
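A bare-bones Python sketch of this nudge-and-restart idea follows.  It is an illustration under assumptions rather than the program I ran: the tension formula (root of the summed squared errors divided by the summed squared desired distances) is one common normalization that I am assuming here, and the names `mds_map`, `patience`, and `restarts`, along with their default values, are made up.

```python
import math
import random


def tension(points, target):
    """Normalized mismatch between mapped and desired distances (0 is a perfect fit)."""
    num = den = 0.0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            mapped = math.dist(points[i], points[j])
            num += (mapped - target[i][j]) ** 2
            den += target[i][j] ** 2
    return math.sqrt(num / den)


def mds_map(target, good_enough=0.15, restarts=20, patience=2000, seed=None):
    """Place points in 2-D by random nudging, restarting from scratch after a bad stall.

    `target` is a symmetric matrix of desired distances.
    Returns (points, tension) for the best layout found.
    """
    rng = random.Random(seed)
    n = len(target)
    scale = max(max(row) for row in target)
    step = scale / 10  # largest nudge along each axis
    best_points, best_score = None, float("inf")
    for _ in range(restarts):
        # Fresh random placement (the "blow it up" step after a stall).
        points = [[rng.uniform(0, scale), rng.uniform(0, scale)] for _ in range(n)]
        score = tension(points, target)
        stalled = 0
        while stalled < patience:
            i = rng.randrange(n)
            old = points[i][:]
            points[i][0] += rng.uniform(-step, step)
            points[i][1] += rng.uniform(-step, step)
            new_score = tension(points, target)
            if new_score < score:
                score, stalled = new_score, 0  # keep the nudge
            else:
                points[i] = old  # undo and try another direction
                stalled += 1
        if score < best_score:
            best_points, best_score = [p[:] for p in points], score
        if best_score <= good_enough:
            break  # acceptable map found, no need to restart
    return best_points, best_score
```

Feeding it the impossible triangle from above (10, 10, 30) can never get the tension under 0.15 no matter how many restarts it is given, while a triangle that can really be drawn, such as 3, 4, 5, settles very close to zero.  Because the starting positions are random, the orientation of the resulting map changes from run to run unless `seed` is fixed.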

Here is a sample of how the method works, showing all the women to whom Marcel is attracted.  Mlle Stermaria is a very minor character, and the other four are recurrent throughout the novel, with Albertine getting a bit more ink than Odette, Gilberte, and the Duchesse de Guermantes.  Mlle Stermaria's few mentions are mostly in volumes where Albertine is also present.  She appears in the beach town of Balbec, where Marcel meets Albertine and her friends in Shadow of Young Girls.  In Sodom and Gomorrah, Marcel is planning a dinner with and seduction of Mlle Stermaria.  Meanwhile Albertine shows up and Marcel attempts to be rid of her so he can make his date.  So Mlle Stermaria is fairly far away from everyone but closest to Albertine.  Albertine herself generally dominates the two volumes about her so is a little distant from the other three.  The relationships among Odette, Gilberte, and the Duchesse change as the novel progresses.  Early on, Gilberte is a child and is therefore near her mother, Odette.  The Duchesse is Swann's friend and will receive him, but not his wife Odette, who has a scandalous past as a courtesan.  Nor will she receive Gilberte as she grows up, until Gilberte inherits a great deal of money and marries the Duchesse's nephew, Robert de Saint-Loup.  At that point (late in The Fugitive), we see Gilberte and the Duchesse get mentions in close proximity to one another.

 
Here is the distance map which results.  The size of the circle is proportional to the number of mentions of each character.
Gilberte's closeness to the Duchesse relative to her own mother could be seen as a sort of triumph for Swann, who wished his daughter to be accepted in the highest of society, represented by the Duchesse early on.  However, Gilberte only manages this after Swann's death, and by taking the name of his rival, the Baron de Forcheville, after he marries Odette.  Swann himself is forgotten and renounced, but his daughter, and in the final volume his granddaughter, achieve social success.

I'll do several other groupings of characters in posts to come.
