Lachesis, posted March 16, 2005

Before still more effort is wasted on translation, I would like to bring this up again: the encyclopedia format and code need a deep cleanup. Shouldn't it look more like this:

    <encyclopedia>
      <title>The Eternal Lands Encyclopedia</title>
      <chapter>
        <title>Alchemy</title>
        <item>
          <name>Fire Essence</name>
          <usage>Used for smelting and Crafting</usage>
          ...
      <chapter>
        <title>Attributes</title>
        <section>
          <title>Base attributes</title>
          <base-attribute>
            <name>Physique</name>
            <cross-attribute>Might</cross-attribute>
            ...

And so on. Layout, line breaking and page wrapping could be done in the client instead of manually in the XML files. We have XML, we have C, so let's use them as intended. That would make things a lot easier for authors and translators. If you don't like layout information in the code, it can be put into a separate XML or XSLT file. Wouldn't you like that?

If you agree, I am going to suspend my OpenGL research immediately in order to design a new encyclopedia format that satisfies the needs of authors and translators, and to rewrite the client code. Of course I would be even happier if someone volunteered for either that work or the atmospherics implementation (please see Roja's thread) so that both can be worked on.

With regards
Lachesis
Grum, posted March 16, 2005

Yes, this is definitely the way the encyclopedia should work. If you could do this, Lachesis, it would be highly appreciated, since the thing as it is now is a pain to translate.
Lachesis, posted March 16, 2005

Grum wrote: "Yes, this is definitely the way the encyclopedia should work. If you could do this, Lachesis, it would be highly appreciated, since the thing as it is now is a pain to translate."

That was my reason for making this post. I was fed up with the OpenGL Red Book for the moment and had some spare time, so I took a look at the encyclopedia in order to start a translation ... and nearly fell off my seat.
jamesvm, posted March 16, 2005 (edited March 16, 2005)

When you do this, please also add a tag to center text in the encyclopedia window.
Lachesis, posted March 17, 2005

I have some questions:
1) Should this be moved into a new thread?
2) Is using C++ allowed? I would miss the STL badly if not.

As for the tag request, this is my current outline of the system:

There will be two encyclopedia formats: one with semantic markup and one with visual formatting markup. The semantic markup is intended for authors and translators. For practical reasons, it will contain some information about language-specific formatting, such as different quotation marks, flow direction (if anyone ever wants Arabic or traditional Chinese) or spacing rules.

Designers can write a language-independent XSLT stylesheet that transforms the semantic markup into visual markup. This stylesheet must be smart enough to obey the language-specific formatting rules mentioned above. Only the transformed XML with the visual markup will be used for display.

I have thought about borrowing some functionality from WikiWikiWeb for automatic hyperlinks. I wouldn't like explicit hyperlinks, since they require identifiers for semantic entities that would need to be retained somehow in the visual markup.

I am going to do the transition of the English version to the new format myself, including the generic transformation stylesheet. However, expect it to take several months altogether; I'm afraid I can spend at best 10 hours a week on it.

Lachesis
Wytter, posted March 17, 2005

I don't think that mixing C with C++ would be a good idea just for the encyclopedia. But yes, I agree that this is how the encyclopedia should work, and encyclopedia writers shouldn't really have to worry about formatting with n characters per line and so on. That must be done in the client instead. It sounds great that you are willing to help us, Lachesis. Good luck with the project.
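A minimal sketch of the client-side line breaking Wytter describes, assuming a fixed number of character columns per line and the client's current 8-bit characters (one byte per column). The name `wrap_text` and its signature are hypothetical and are not taken from the EL client source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Greedy word wrap: copies `text` into `out`, joining words with a single
 * space and starting a new line whenever the next word would push the
 * current line past `width` columns. A word longer than `width` is left
 * unbroken on a line of its own.
 * Returns the number of lines written (1 for empty input), or 0 if `out`
 * is too small to hold the result. */
size_t wrap_text(const char *text, size_t width, char *out, size_t out_size)
{
    size_t out_len = 0;   /* bytes written so far */
    size_t col = 0;       /* column reached on the current line */
    size_t lines = 1;
    const char *p = text;

    assert(text != NULL && out != NULL && width > 0);
    if (out_size == 0)
        return 0;

    while (*p != '\0') {
        size_t word_len;

        while (*p == ' ')               /* skip runs of spaces */
            p++;
        word_len = strcspn(p, " ");     /* length of the next word */
        if (word_len == 0)
            break;                      /* only trailing spaces were left */

        /* room for one separator byte, the word and the final NUL */
        if (out_len + word_len + 2 > out_size)
            return 0;

        if (col > 0) {
            if (col + 1 + word_len > width) {
                out[out_len++] = '\n';  /* word does not fit: break the line */
                col = 0;
                lines++;
            } else {
                out[out_len++] = ' ';
                col++;
            }
        }
        memcpy(out + out_len, p, word_len);
        out_len += word_len;
        col += word_len;
        p += word_len;
    }
    out[out_len] = '\0';
    return lines;
}
```

With a width of 12, the text "Used for smelting and Crafting" comes out as three lines: "Used for", "smelting and", "Crafting". The author only writes the sentence; the column limit lives in the client.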
Grum, posted March 17, 2005

I'm not too fond of mixing in C++ either. It sounds like you have good and clear ideas on how to do this, Lachesis. Take your time, and do it right.
Malaclypse, posted March 17, 2005

I agree that the encyclopedia format needs work. I already mentioned it in this thread. That is the way the files should look in the encyclopedia. But I was thinking of using an already available format. Instead of defining our own document type, why don't we use DocBook? It's a well-known, open XML format that is widely used for such purposes. It gives us a pre-made DTD to validate the documents against, using libxml2. If you don't know DocBook, here are some references:

http://docbook.org/
http://www.oasis-open.org/docbook/
http://docbook.sourceforge.net/

It sounds like you are comfortable with this, Lachesis, so what do you think? I too don't much like the idea of mixing C and C++.
Lachesis, posted March 17, 2005

Uh-oh, so you want me to implement a typesetter without the tools of OOP. That's a challenge. Which doesn't mean that I am not going to take it on.

Thank you for your hint, Malaclypse. Personally I like using existing technologies, but this time I definitely prefer creating our own XML subset. The first reason is that the EL encyclopedia is a very special case. I would like to exploit this special nature to the full in order to keep the encyclopedia XML source as compact as possible. The second reason is that, because of this special nature, we only need a tiny XML subset, which makes the C implementation, the XSL transformation and the XML source as simple and compact as possible.

Thus, I don't want DocBook at all. DocBook is a comprehensive general-purpose description language for textual documents. It contains more than 400 tags that have complex, highly restrictive (though somewhat intuitive) nesting rules. Not only may it be hard for authors and translators to learn, it would also cause serious code bloat and occupy my working time for years rather than months. So DocBook is not an option for me, and probably neither are other multipurpose XML subsets.

No offence meant, Malaclypse! I always appreciate your hints, but this time I just think it's not applicable.

With kind regards
Lachesis
Malaclypse, posted March 17, 2005

Please don't get me wrong, but the example XML you gave differs from DocBook only in the root tag. Any of the other tags you used could be used in exactly the same way with DocBook. And I don't see that it is hard to learn in our case. The encyclopedia is, in my opinion, in no way special, with the exception of the content, that is, the actual words and sentences that occur in it. We only need a small subset of the several hundred tags provided by DocBook, for which we can provide a template to authors and translators, so there would be no need to actually learn DocBook. In fact, I suggested DocBook because there are several editors out there that could be used by authors who have no knowledge of DocBook or of XML in general, potentially increasing the number of contributors, whereas for our own schema an author must have at least some knowledge of XML. But this is your decision, of course.

Lachesis wrote: "so you want me to implement a typesetter without the tools of OOP"

In my opinion we don't need a typesetter. We need a schema (whether it's written as an XSD or as a DTD doesn't really matter), and we need a couple of functions to read the help files and transform them to our needs, using XSLT or another transformation. Most of the work is already done in the libxml2 library, which we already use, and I don't see a reason not to use it further. The creation of a separate schema could be omitted when using an already existing schema; that was the other reason I suggested DocBook. I think it's actually less work for the implementor.
jamesvm, posted March 17, 2005 (edited March 17, 2005)

what you ever people do make when encyclopedia that allow scrolling bar in it and don't has worry about n characters those major problem why so difficult to work on. (and that has restart el client every time see the result too.) Otherwise don't bother messing with encyclopedia code. but nice tag center stuff in encyclopedia like back to index link so don't space to do it
Lachesis, posted March 18, 2005

Please can someone translate James' post for me? I don't understand him.

Mala, I know that what I posted looks like DocBook, and that's not accidental. However, please wait until the schema for the semantic markup is complete, and you will probably understand me better. The main document structure may look like DocBook, but the smaller parts will look completely different.

Editors are an important point. However, generic XML editors will do, as in most cases filling in some values and possibly some copy and paste will be all that needs to be done.

And of course we need typesetting code, in order to place the characters in the window. And you don't want to do this using XSLT, believe me. Maybe I'm wording it wrongly; let's talk in private about that so that I don't need to talk in English.

Thank you all for your comments
Lachesis

P.S. Please decide whether you want scroll bars or automatic page breaks. I don't care which one you choose, but I need to know.
Grum, posted March 18, 2005

James, that's basically the idea: to remove the limitations on string lengths that currently exist, and to have tags that describe where an item should be placed instead of having to type spaces and line breaks yourself.
jamesvm, posted March 18, 2005

Grum wrote: "James, that's basically the idea: to remove the limitations on string lengths that currently exist, and to have tags that describe where an item should be placed instead of having to type spaces and line breaks yourself."

Exactly, but why does it sound like Lachesis wants to rewrite the whole XML language for the encyclopedia when we really only need to make minor adjustments to the code?
Lachesis, posted March 18, 2005

Because for automatic centering, line wrapping, page wrapping and so on I need to rewrite the code substantially anyway. So I prefer to go all the way and separate layout and content completely. I don't like ad-hoc coding; if I code something, I do it the best way I can. I'm pretty sure you will like it.
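The automatic centering mentioned here, which is also what the requested center tag would need, reduces to computing a left offset in the client. A minimal sketch, assuming a fixed-width character grid; the names `center_padding` and `center_line` are hypothetical and not part of the EL client.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of leading columns needed to center a line of `line_len` columns
 * in a field of `width` columns. Lines that are too wide get no padding. */
size_t center_padding(size_t line_len, size_t width)
{
    return line_len < width ? (width - line_len) / 2 : 0;
}

/* Writes `line` into `out`, preceded by the padding computed above.
 * Returns 1 on success, 0 if `out` is too small. */
int center_line(const char *line, size_t width, char *out, size_t out_size)
{
    size_t len, pad;

    assert(line != NULL && out != NULL);
    len = strlen(line);
    pad = center_padding(len, width);

    if (pad + len + 1 > out_size)
        return 0;
    memset(out, ' ', pad);
    memcpy(out + pad, line, len + 1);   /* also copies the terminating NUL */
    return 1;
}
```

A client that draws text at pixel positions would use the same arithmetic on pixel widths instead of column counts, so authors would no longer need to type leading spaces by hand.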
jamesvm, posted March 18, 2005

Lachesis wrote: "Because for automatic centering, line wrapping, page wrapping and so on I need to rewrite the code substantially anyway. So I prefer to go all the way and separate layout and content completely. I don't like ad-hoc coding; if I code something, I do it the best way I can. I'm pretty sure you will like it."

OK, just make sure you write the code so that it is not more complex to add new encyclopedia files or rewrite existing ones.
Wytter, posted March 18, 2005

The idea is that it will become less complex to write encyclopedia files, and it will give you some new tools that you don't currently have.
jamesvm, posted March 19, 2005 (edited March 19, 2005)

ok here question if encyclopedia going rework then does mean all file once code been changes has redone. so what point add to it then? if so would help if converted all page in encyclopedia to raw text again
Lachesis, posted March 21, 2005

Sorry James, but again I cannot understand what you are saying. I am not a native English speaker, so please understand my difficulties.

Sorry for the inconvenience
Lachesis
Malaclypse, posted March 21, 2005

James, although the encyclopedia files will need to be changed after the code changes, I think you should still add new items for now. Don't convert them to plain text files, at least not at the moment. I think it's better done in one step, after the code changes are finished. Maybe it's even possible to do this semi-automatically, saving you from completely rewriting the whole encyclopedia in the new format.
jamesvm, posted March 21, 2005 (edited March 21, 2005)

OK. And Lachesis, or whoever does it, make sure you take the time to do a good job of it.
Lachesis, posted March 22, 2005

I think that probably a large part of the conversion to the new format (whatever its structure) can be done automatically, though certainly not all of it. Of course I will try to make the conversion script as smart as possible, reducing the amount of manual editing as much as I can. Let's see how far we get with this; unfortunately it doesn't look easy. But maybe I'll even be able to do the manual editing myself if not much is left.

First of all, we need to define the semantic and visual markup formats. I'm going to post examples of two possible semantic formats as a basis for discussion among encyclopedia authors and translators. After all, the intention of this work is to make writing and translating articles as convenient as possible for them.

Later on, I am going to present different models for visual markup. Separating the semantic and visual representation of the content introduces the possibility of (and encourages) some division of labour that has not existed so far. Until now, authors were also designing the visual presentation, but it does not need to stay that way in future. So I don't know who will be most affected by the visual markup, besides the client programmer who implements it. For those who are interested, I am currently considering the current encyclopedia format, an XSL-FO subset and a tiny custom XML language as candidates. But for now, the semantic markup format is the most urgent thing to define.

In addition, I have a question: how far is internationalization going to be taken in EL? The current 8-bit character format includes 7-bit ASCII, some color codes and specially defined non-ASCII characters. However, we might quickly run out of available symbol representations if several non-Latin symbol sets were included, such as Greek, Cyrillic, Hebrew, Arabic, Japanese, Chinese, Hangul and so on.

So maybe migration to UTF-8 encoding is planned*, and if so I would try to prepare the code for it already, so that it does not need to be rewritten again soon. The same concern applies to any other coding plans that affect the encyclopedia in any way.

Sorry for the long posting!

With regards
Lachesis

*) We could even encode the color codes in a modified UTF-8, using codes beyond the Unicode range, such as 0x3000000-0x3FFFFFF. That way one could encode all 24-bit colors. Color codes could easily be detected in the string by searching for the leading byte 0xFB. We would need to write our own conversion functions, but hey, it's just a small piece of code.
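The footnote's arithmetic can be checked with a short sketch. It assumes the 5-byte sequence layout from the original UTF-8 definition (111110xx followed by four 10xxxxxx bytes), a form that current UTF-8 (RFC 3629) no longer permits, which is why 0xFB cannot collide with ordinary text. Codes 0x3000000-0x3FFFFFF have both top bits set, so the lead byte is always 0xFB and the remaining 24 bits carry the color. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COLOR_BASE 0x3000000u   /* first code of the proposed color range */
#define COLOR_LEAD 0xFBu        /* lead byte shared by every code in that range */

/* Encode a 24-bit RGB value as the 5-byte sequence for COLOR_BASE + rgb. */
void color_encode(uint32_t rgb, unsigned char out[5])
{
    uint32_t code = COLOR_BASE | (rgb & 0xFFFFFFu);

    assert(out != NULL);
    out[0] = (unsigned char)(0xF8u | (code >> 24));            /* 111110xx, xx = 11 */
    out[1] = (unsigned char)(0x80u | ((code >> 18) & 0x3Fu));  /* 10xxxxxx */
    out[2] = (unsigned char)(0x80u | ((code >> 12) & 0x3Fu));
    out[3] = (unsigned char)(0x80u | ((code >> 6) & 0x3Fu));
    out[4] = (unsigned char)(0x80u | (code & 0x3Fu));
}

/* Decode a color sequence at `s` (at least `len` bytes available).
 * Returns 1 and stores the RGB value on success, 0 if the bytes are not a
 * well-formed color code. */
int color_decode(const unsigned char *s, size_t len, uint32_t *rgb)
{
    uint32_t code;
    size_t i;

    assert(s != NULL && rgb != NULL);
    if (len < 5 || s[0] != COLOR_LEAD)
        return 0;
    code = s[0] & 0x03u;
    for (i = 1; i < 5; i++) {
        if ((s[i] & 0xC0u) != 0x80u)    /* continuation bytes must be 10xxxxxx */
            return 0;
        code = (code << 6) | (s[i] & 0x3Fu);
    }
    *rgb = code & 0xFFFFFFu;            /* drop the two range-marker bits */
    return 1;
}
```

For example, the color 0xFF8000 encodes to the bytes FB BF B8 80 80. Each color costs five bytes in the string, which is the price of keeping colors inline with the text.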
jamesvm, posted March 22, 2005 (edited March 22, 2005)

Cool, sounds like you have done a lot of planning. I have been thinking of reorganizing the encyclopedia files so that the files themselves are not so huge. Problem use category use for NPC storage system.
Wytter, posted March 22, 2005

There are no plans to use UTF-8 for strings in the client or server. But it could definitely be a huge asset when it comes to entering the eastern markets. The problem is that, in the worst-case scenario, all strings would have to be 4x as large, which would increase memory consumption quite a bit...
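To put the worst case in numbers, here is a small sketch of how many bytes standard UTF-8 (RFC 3629) needs per character. It reads the 4x figure as the 4-byte maximum of UTF-8, which is an assumption about what was meant; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode one code point in standard UTF-8, or 0 if the
 * value cannot be encoded (UTF-16 surrogates, values beyond U+10FFFF). */
size_t utf8_encoded_len(uint32_t cp)
{
    if (cp < 0x80u)
        return 1;                       /* ASCII stays one byte */
    if (cp < 0x800u)
        return 2;                       /* e.g. Greek, Cyrillic, Hebrew, Arabic */
    if (cp >= 0xD800u && cp <= 0xDFFFu)
        return 0;                       /* surrogates are not characters */
    if (cp < 0x10000u)
        return 3;                       /* rest of the BMP, e.g. most CJK, Hangul */
    if (cp <= 0x10FFFFu)
        return 4;                       /* supplementary planes */
    return 0;
}

/* Total UTF-8 size of `n` code points, without a terminating NUL.
 * Code points that cannot be encoded contribute nothing. */
size_t utf8_string_len(const uint32_t *cps, size_t n)
{
    size_t total = 0;
    size_t i;

    assert(cps != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += utf8_encoded_len(cps[i]);
    return total;
}
```

Under this reading, existing ASCII strings would not grow at all, and only text made up entirely of supplementary-plane characters would reach the full factor of four.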
Lachesis, posted March 23, 2005 (edited March 24, 2005)

Wytter wrote: "The problem is that, in the worst-case scenario, all strings would have to be 4x as large, which would increase memory consumption quite a bit..."

Yes, that's true. However, all messages except quest messages could be kept client-side, and the client does not need to hold them all in memory. So one could reduce the memory consumption again. Anyway, these are dreams of the future.

Malaclypse asked me to drop the DocBook alternative for the semantic layout in order to get things done more quickly. So I am going to post only the XML schema and an example file of the newly defined format, to give you authors and translators an opportunity for complaints. Though they are done, it will still take some time to prepare a location for publishing these and, later, other files related to this work.

Happy Easter, my mates!
Lachesis

Edit: The semantic markup definition schema is now available at Kl4uz' site. Thank you Klaus!