=========================================================================
Date: Fri, 19 Nov 1993 02:13:25 CST
Reply-To: Richard Giordano
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: Richard Giordano
Subject: Lectureship Advertisement

Greetings fellow TEI folks:

I would urge all of you who do interesting work in natural language
processing, advanced text processing, very large database design, etc.
to consider this lectureship, and to contact me if you have any
questions. We are a large and interdisciplinary department, and you
don't necessarily need a Computer Science PhD to get a job here. I
should know, my background is in American Intellectual History (though,
in truth, I do have an MS in information science). I think it's a good
place to work. If you want to ask some information questions, please
contact me at rich@cs.man.ac.uk.

/rich

> ----------------------------------------------------------------------
> LECTURESHIP in COMPUTER SCIENCE
>
> DEPARTMENT of COMPUTER SCIENCE
> FACULTY of SCIENCE
> UNIVERSITY of MANCHESTER
>
>
>
> Applications are invited for a Lectureship in the Department of
> Computer Science within the Faculty of Science at the University of
> Manchester.
>
> Applications are especially welcomed from researchers who have made a
> promising start towards a career in any area of Computer Science and
> who are (expected to become) influential members of the international
> research community. Suitable applicants will have already published
> promising results. The area of expertise is less important than the
> quality of work done to date.
>
> The Department is one of the top rated research departments in
> Computer Science in the UK and offers excellent computing and
> laboratory facilities in support of research and teaching.
>
> The University is an Equal Opportunity employer and Job-Share
> applications are welcomed.
>
>
> The appointment is tenable immediately.
> Salary will be on the scale
> L.14,396 to L.18,855 (Lecturer Grade A) according to qualifications
> and experience.
>
> Further particulars and application forms may be obtained from The
> Registrar, University of Manchester, Oxford Rd, M13 9PL to whom formal
> application should also be made.
>
> Closing Date: Friday 26th November.
> ----------------------------------------------------------------------
>
>
=========================================================================
Date: Tue, 23 Nov 1993 14:44:40 CST
Reply-To: Lou Burnard
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: Lou Burnard
Subject: Report on WWW/TEI Meeting (Cork, 18-21 Nov)

As originally proposed at ACH-ALLC in Washington earlier this year,
Peter Flynn of the Curia Project at the University of Cork organized a
two day meeting with a general view of creating dialogue between the
TEI and the developers of World Wide Web, one of the most rapidly
growing computer systems since the Internet itself.

WWW is a distributed hypertext system running at some improbably large
number of sites worldwide, which uses a very simple SGML tagset called
HTML (it has been rather unkindly characterized as "Pidgin-SGML"). WWW
itself consists of a markup language (HTML), a set of Internet
protocols (FTP, HTTP etc) and a naming scheme for objects or resources
(the "Universal Resource Locator" or URL). A number of browsers are now
available which use these components. Mosaic, developed at NCSA, is
probably the most impressive: running on Mac, X and Windows it offers a
fully graphical interface with just about everything current technology
can support. Lynx, developed at the Computer Science dept at U of
Kansas, is at the opposite extreme, assuming only a VT100 (there is
also a WWW-mode for EMACS!).

I will not attempt here to describe WWW in operation.
Web browsers are freely available by anonymous FTP all over the place:
if you haven't tried it out already, and can't see what all the fuss is
about, then you should stop reading now, get yourself a browser and do
so forthwith.

The two day meeting was attended by Chris Wilson (NCSA); Lou Montulli
(Lynx, U Kansas); Bill Perry (EMACS, Indiana University); Dave Raggett
(Hewlett-Packard; HTML+) and myself for TEI. Various representatives of
the Curia project, notably Patricia Kelly from the Royal Irish Academy,
were also present.

I gave a short presentation about the TEI, focussing mostly on
contextual issues but also including some detailed technical stuff
about bases and toppings and X-pointer syntax, which seemed to be well
received. Dave Raggett then talked us through the current HTML+ draft
which started off a very wide ranging discussion. This continued during
the second day of the meeting, but was at least partially nailed down
in the shape of a brief report (see below) which should be somewhere in
the Web by the time you read this one.

To their credit, most WWW people seem painfully aware of the
limitations of the current HTML specification, which was very much an
experimental dtd hacked together in haste and ignorance of the finer
points of SGML (or indeed the blunter ones). HTML+, which Dave Raggett
has been working on for the last year or so, attempts to improve on it
without sacrificing too much of its flexibility. This draft will
eventually progress to Internet RFC status; there is also talk of an
IETF working group co-chaired by Raggett and Tim Berners-Lee (of CERN;
onlie begettor of the Web) to steer this process through.

The Cork meeting was an interesting opportunity for the developers of
three of the major Web browsers to meet face to face and argue over
some of the design decisions implicit in the HTML+ spec. To some extent
this did happen, though the discussion was rather anarchic and
unstructured.
It was also a good opportunity for the TEI to encourage development of
HTML+ in a TEI convergent manner, and this I think was achieved.
Several of the changes accepted, at least in principle, will make it
much easier to transform TEI documents into HTML, if not vice versa.
Some practical issues about how WWW should handle TEI conformant
documents were also resolved.

Outside the meeting, this was also a good opportunity to find out more
about the Curia project itself. My hasty assessment is that this
project has still some way to go. There is a clear awareness of the
many different ways in which it could develop, and a tremendous
enthusiasm. I think the project would benefit from some detailed TEI
consultancy before too much more P1-conformant material is created. It
also offers interesting contrastive opportunities with other
corpus-building activities, chiefly because of its enormous diachronic
spread, and its polyglot nature.

Lou Burnard, Cork, 21 Nov 93

========= Concluding statement of the WWW/TEI Meeting follows ==========

WWW/TEI Meeting

Notes from WWW/TEI Meeting

Action Items/Recommendations

  • HTML 1.0 should be documented to define the behavior of existing browsers, and should be frozen as agreed upon at the WWW Developers' Conference.
    • Features to be documented, implemented and specified include collapsing spaces, underline, alt attribute, BR, HR, ISMAP...
    • HTML IETF spec needs to be updated by CERN, as well as existing documentation
  • HTML+ future browsers need not support HTML 1.0 features after a reasonable amount of time. As an aid in transition, the HTML+ spec/DTD will not include any deprecated features of HTML 1.0.
    • HTML 1.0 deprecated features
      • nextid
      • method, rel, rev, effect from <A> tag (but not from the <LINK> tag)
      • blockquote --> quote
      • There was a feeling that the <img> tag will be superseded by the <fig> tag, although its deprecation was not agreed upon.
      • menu list --> ul
      • dir list --> ul
  • The intention of HTML+ is to support generic SGML-compliant authoring tools, and authors are recommended to use this software with the HTML+ DTD for the creation/maintenance of documents.
  • Browsers may implement different levels of HTML+ conformance.
    • Level 0 implementation
      • HTML 1.0 spec referenced above
    • Level 1 implementation
      • Partial fill-out forms
      • New entity definitions (in section 5.1 of HTML+ draft)
    • Level 2 implementation
      • Additional presentation tags (sub, sup, strike) & logical emphasis
      • Full forms support (incl. type checking)
      • Generic emphasis tag
    • Level 3 implementation
      • Figures
      • NOTEs and admonishments
    • Further levels to be specified
  • Authoring tools are expected to conform to the HTML+ DTD and are NOT to support deprecated features.
  • We expect the HTML+ DTD to be developed incrementally. The HTML+ internet draft will make clear which features are now stable and which are still subject to change. The DTD will be structured to reflect this.
    1. HTML+ will work with the SGML reference concrete syntax.
    2. The entity sets will be user-specifiable (in the long run).
    3. HTML+ will support nested divisions or containers.
    4. There will be a number of new features
      • Figures & Images
        • <fig> may be able to subsume the role of <img>.
      • Generic highlighting tag
        • The <em> tag will be used with a set of three or four defined attributes, each guaranteed a distinct presentation.
      • Generic roles
        • Support for undefined elements (user extensions) (render)
      • Tables
        • This is now stable.
      • Math
        • for research
  • HTML/TEI
    • It was felt the correct way to convert between TEI and HTML was to do it on the server side using a conversion filter.
    • This server will also provide a hypertext link to download the raw TEI text.
    • We (WWW developers and TEI people) will strive together to converge functionality between HTML* and TEI, as well as to produce this server/filter system.
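
The server-side conversion filter recommended under HTML/TEI above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the filter the meeting had in mind: the tag mapping (TAG_MAP), the function names, and the raw_url parameter are all hypothetical, and the input is assumed to be a well-formed, XML-like fragment with explicit end tags so that Python's standard library can parse it. Real TEI P2 documents are SGML and would need a proper SGML parser.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mapping from a few TEI element names to HTML tags.
# Anything not listed here is dropped, but its text content is kept.
TAG_MAP = {"head": "h2", "p": "p", "hi": "em", "list": "ul", "item": "li"}


def convert(elem):
    """Recursively render one element as HTML text."""
    inner = escape(elem.text or "")
    for child in elem:
        # Render the child, then the text that follows it inside this element.
        inner += convert(child) + escape(child.tail or "")
    tag = TAG_MAP.get(elem.tag.lower())
    return f"<{tag}>{inner}</{tag}>" if tag else inner


def tei_to_html(tei_source, raw_url):
    """Convert a TEI-like fragment and append a link to the raw TEI text."""
    body = convert(ET.fromstring(tei_source))
    link = f'<p><a href="{escape(raw_url, quote=True)}">Raw TEI text</a></p>'
    return body + link
```

A server would call tei_to_html() on each request for a TEI document and return the result to the browser; the appended link corresponds to the recommendation that the server also offer the raw TEI text for download.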
=========================================================================
Date: Fri, 26 Nov 1993 10:26:25 CST
Reply-To: "C. M. Sperberg-McQueen"
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: "C. M. Sperberg-McQueen"
Organization: ACH/ACL/ALLC Text Encoding Initiative
Subject: new chapter: verse

             * * * * * * * * * * * * * * * * *
             *            TEI P2             *
             *  new fascicle now available   *
             *          Chapter VE           *
             *    Base tag set for Verse     *
             * * * * * * * * * * * * * * * * *

A new chapter of TEI P2, chapter 9 (VE), defining a base tag set for
Verse, is now available for public comment.

As readers of this list will recall, TEI P2 is the second draft of the
TEI Guidelines for Electronic Text Encoding and Interchange, and is
being distributed for comment chapter by chapter, as and when the
chapters are ready for comment. (File TEI ED J8, "Obtaining the Second
Version of the TEI Guidelines," has the details, if you have
forgotten).

Chapter 9 (known internally as 'VE') defines tags for a more detailed
encoding of verse material than is possible with the core tag set
alone. The core defines tags for verse lines and line groups; the base
tag set defined in this chapter adds:

 - tags for hierarchically arranged line groups (LG1, LG2, ... LG5)
   similar in concept and usage to the 'numbered' DIV elements found in
   the default text structure tags
 - methods for using the generic segmentation element (SEG) for
   analysis of metrical structure below the line level
 - a CAESURA element for marking caesuras
 - attributes (attached to SEG, L, LG, LG1 etc., the DIV elements, and
   BODY) for recording rhyme pattern, metrical pattern, and concrete
   realization of the metrical pattern. These attributes can take
   user-defined notations for the information involved; a default
   system (the conventional 'abab' notation) is defined for the RHYME
   attribute; no default is defined for the MET and REAL attributes.
 - examples of the metrical, prosodic, and rhyme-scheme notations being
   applied to stichic and stanzaic verse
 - an example of the LINK and LINKGRP elements defined in chapter SA
   (on segmentation, alignment, and linking) being applied to the
   encoding of line-internal rhyme

Together with this chapter, the following DTD files are being released:
teivers2.ent, which defines special modifications to the TEI class
system; teivers2.dtd, which provides element and attribute list
declarations for all elements in the tag set, and teivegis.dtd, which
defines the parameter entities used for the generic identifiers in the
tag set. New versions of the core DTD files, appropriately modified to
work with these DTD fragments, will be posted to the servers in the
course of the next few weeks.

We append the usual information on how to retrieve this chapter, for
the convenience of subscribers.

-C. M. Sperberg-McQueen
 Lou Burnard

 26 November 1993

-----

Texts of P2 are being made available in a number of different
electronic formats. These include plain screen-readable text (filetype
DOC), LaTeX (filetype TEX), PostScript (filetype PS) and of course SGML
(filetypes P2X and REF). In addition, many chapters define specific DTD
files which are released together with the chapter. All files can be
retrieved from any of several servers, as described below:

1 Getting files from the server in Chicago

To get electronic copies of this fascicle from the TEI-L fileserver at
the University of Illinois at Chicago, all you need to do is send an
ordinary email note to the address LISTSERV@UICVM (or, from the
Internet: listserv@uicvm.uic.edu) containing whichever of the following
lines describe the file(s) you want:

    get p2ve doc
    get p2ve ps
    get p2ve p2x
    get p2ve ref
    get p2vedriv p2x
    get teivers2 ent
    get teivers2 dtd
    get teivegis dtd

The DOC and PS files include the complete fascicles; the P2X file
contains only this chapter.
The file P2VEDRIV P2X is a 'driver file' which embeds the chapter file
P2VE P2X and the accompanying reference material, P2VE REF. Further
details on the DTD used may be had by contacting the editors.

The documents you request will be returned to you automatically as
e-mail messages. Beware! Some of the files are quite large, and so may
be delayed. You will also receive an automatic notification that the
file is on its way to you. (If you receive something illegible in a
'Listserv packed format', please contact one of the editors directly to
see about getting you the file in a more useful form.)

2 Getting files from the server in the UK

The same files are available via anonymous FTP from the SGML Project at
the University of Exeter. To access these files, your computer system
must be on the Internet. If it is, you should be able to give the
command

    FTP sgml1.ex.ac.uk     [ or FTP 144.173.6.61 ]

When you are connected to the Exeter SGMLbox, type the following
commands (or whichever of them actually describe the files you want):

    cd tei/p2/drafts
    get p2ve.p2x
    get p2ve.doc
    get p2ve.ps
    get p2ve.ref
    cd ../dtds
    get teivers2.ent
    get teivers2.dtd
    get teivegis.dtd

(note that the filename *must* be given in all lower-case letters)

3 Getting the files from other servers

The files may also be obtained from the Markup-L Listserv fileserver in
Germany, and from Professor Syun Tutiya in Japan. For more details on
these and other sources of TEI information, please order copies of
files

    EDJ8 MEMO (describes how to retrieve electronic copies of TEI P2
              and the various formats in which they are available)
    EDJ9 MEMO (describes how to request paper copies of TEI P2, for
              those without electronic mail access)

(on the Exeter file server, get file tei/intro/edj8.doc)
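
For readers who prefer to script the anonymous FTP session described in section 2 above, the same steps can be sketched in Python with the standard ftplib module. This is an illustrative sketch only: the host name and directory paths are copied from the message and date from 1993, so they may no longer resolve; the function names plan() and fetch_all() are hypothetical; and the code assumes that the tei/ directory sits directly under the anonymous login root.

```python
from ftplib import FTP

# Host and paths as quoted in the message above (1993); they may no longer exist.
HOST = "sgml1.ex.ac.uk"
FILES = {
    "tei/p2/drafts": ["p2ve.p2x", "p2ve.doc", "p2ve.ps", "p2ve.ref"],
    "tei/p2/dtds": ["teivers2.ent", "teivers2.dtd", "teivegis.dtd"],
}


def plan(files=FILES):
    """Return (directory, filename) pairs, with file names forced to lower case."""
    return [(d, name.lower()) for d, names in files.items() for name in names]


def fetch_all(host=HOST):
    """Download every planned file into the current directory via anonymous FTP."""
    with FTP(host) as ftp:
        ftp.login()  # no arguments means an anonymous login
        for directory, name in plan():
            # Assumption: tei/ is directly under the anonymous root directory.
            ftp.cwd("/" + directory)
            with open(name, "wb") as out:
                ftp.retrbinary(f"RETR {name}", out.write)
```

Calling plan() lists what would be fetched without touching the network; fetch_all() performs the actual transfer and will fail if the server is unreachable.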