<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Recent changes to patches</title><link>https://sourceforge.net/p/cwb/patches/</link><description>Recent changes to patches</description><atom:link href="https://sourceforge.net/p/cwb/patches/feed.rss" rel="self"/><language>en</language><lastBuildDate>Thu, 05 Apr 2018 07:18:41 -0000</lastBuildDate><atom:link href="https://sourceforge.net/p/cwb/patches/feed.rss" rel="self" type="application/rss+xml"/><item><title>#2 Add additional information to Makefile</title><link>https://sourceforge.net/p/cwb/patches/2/?limit=25#bb83</link><description>&lt;div class="markdown_content"&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;status&lt;/strong&gt;: accepted --&amp;gt; closed&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Stefan Evert</dc:creator><pubDate>Thu, 05 Apr 2018 07:18:41 -0000</pubDate><guid>https://sourceforge.net02125f6f38c19810cf5fff290c46ce547610d6b2</guid></item><item><title>#2 Add additional information to Makefile</title><link>https://sourceforge.net/p/cwb/patches/2/?limit=25#c9ac</link><description>&lt;div class="markdown_content"&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;status&lt;/strong&gt;: open --&amp;gt; accepted&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;assigned_to&lt;/strong&gt;: Stefan Evert&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Stefan Evert</dc:creator><pubDate>Thu, 05 Apr 2018 07:18:05 -0000</pubDate><guid>https://sourceforge.netb87d0e4f493b53154e513150f1924848ef8f9f92</guid></item><item><title>Add additional information to Makefile</title><link>https://sourceforge.net/p/cwb/patches/2/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;Please add license information as well as a pointer to your web site at SourceForge to the CWB module. It could make it easier to contribute. &lt;br/&gt;
The information is shown nicely on the left side on MetaCPAN. &lt;br/&gt;
Cf. &lt;a href="https://sourceforge.net/p/perl-trg/code/HEAD/tree/trunk/Makefile.PL"&gt;https://sourceforge.net/p/perl-trg/code/HEAD/tree/trunk/Makefile.PL&lt;/a&gt; for a nice example of a Makefile with a lot of additional information.&lt;/p&gt;
&lt;p&gt;HTH&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Alex</dc:creator><pubDate>Wed, 04 Apr 2018 19:30:12 -0000</pubDate><guid>https://sourceforge.net701bc1e8abed879d903cf70fb4edeb3c50efea61</guid></item><item><title>Add additional information to Makefile</title><link>https://sourceforge.net/p/cwb/patches/2/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;Ticket 2 has been modified: Add additional information to Makefile&lt;br/&gt;
Edited By: Stefan Evert (schtepf)&lt;br/&gt;
Status updated: u'open' =&amp;gt; u'accepted'&lt;br/&gt;
Owner updated: None =&amp;gt; u'schtepf'&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Alex</dc:creator><pubDate>Wed, 04 Apr 2018 19:30:12 -0000</pubDate><guid>https://sourceforge.netb3ef04fd22a29316f21fa8f7a46b3ab19811775c</guid></item><item><title>Add additional information to Makefile</title><link>https://sourceforge.net/p/cwb/patches/2/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;Ticket 2 has been modified: Add additional information to Makefile&lt;br/&gt;
Edited By: Stefan Evert (schtepf)&lt;br/&gt;
Status updated: u'accepted' =&amp;gt; u'closed'&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Alex</dc:creator><pubDate>Wed, 04 Apr 2018 19:30:12 -0000</pubDate><guid>https://sourceforge.neta7cd8306a06d48d77d733f11cba4d01512ca28fa</guid></item><item><title>#1 patches to PrintMode SGML</title><link>https://sourceforge.net/p/cwb/patches/1/?limit=25#0592</link><description>&lt;div class="markdown_content"&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;status&lt;/strong&gt;: open --&amp;gt; wont-fix&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Group&lt;/strong&gt;:  --&amp;gt; Unstable_(example)&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Stefan Evert</dc:creator><pubDate>Wed, 20 Jul 2016 11:45:00 -0000</pubDate><guid>https://sourceforge.net2c22903ccb843cf7395d459f0e90f227f6c1072e</guid></item><item><title>#1 patches to PrintMode SGML</title><link>https://sourceforge.net/p/cwb/patches/1/?limit=25#bace</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;We're not going to change this in CWB 3.5 as it may break other programs that expect the traditional delimiters. Users who require different output formats can patch the source code and recompile CQP.&lt;/p&gt;
&lt;p&gt;SGML, HTML and LaTeX output modes are deprecated and will no longer be supported in future releases (4.0 and later).&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Stefan Evert</dc:creator><pubDate>Wed, 20 Jul 2016 11:44:49 -0000</pubDate><guid>https://sourceforge.net4ece6ee5912870b5ae3786e0ac2b8c4a93ae0cd2</guid></item><item><title>patches to PrintMode SGML</title><link>https://sourceforge.net/p/cwb/patches/1/</link><description>There are two \(or three\) problems with SGML output which make wrapping it in another program difficult:

1\. The attribute separator is "/", which is a problem when the word or an attribute contains "/", e.g.:

CORPUS&amp;gt; show +lemma;
CORPUS&amp;gt; "1/2";
&amp;lt;CONCORDANCE&amp;gt;
&amp;lt;attribute type=positional name="word" anr=0&amp;gt;
&amp;lt;attribute type=positional name="lemma" anr=1&amp;gt;
&amp;lt;LINE&amp;gt;\(...\)&amp;lt;CONTENT&amp;gt;\(...\) &amp;lt;MATCH&amp;gt;&amp;lt;TOKEN&amp;gt;1/2/1/2&amp;lt;/TOKEN&amp;gt;&amp;lt;/MATCH&amp;gt; \(...\)&amp;lt;/CONTENT&amp;gt;&amp;lt;/LINE&amp;gt;
\(...\)
&amp;lt;/CONCORDANCE&amp;gt;

As you can see, it's impossible to reliably extract the attributes from the SGML. My suggestion is to use "&amp;lt;ATTR&amp;gt;" as the attribute separator instead, which will work since "&amp;lt;" and "&amp;gt;" are escaped in SGML output and so can never occur inside a word or attribute value.
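
To illustrate the ambiguity, here is a minimal Python sketch of what a wrapper program faces. It is not part of CWB, and the helper name is hypothetical. It lists every way a token string could be divided into a known number of attributes when the separator can also occur inside a value:

```python
def candidate_splits(token, n_attrs, sep="/"):
    """Return every way to divide token into n_attrs fields on sep."""
    if n_attrs == 1:
        # A single field takes the whole string, separators included.
        return [[token]]
    parts = token.split(sep)
    results = []
    # Decide how many raw parts belong to the first field, then recurse
    # on the remainder for the other fields.
    for i in range(1, len(parts) - n_attrs + 2):
        head = sep.join(parts[:i])
        tail = sep.join(parts[i:])
        for rest in candidate_splits(tail, n_attrs - 1, sep):
            results.append([head] + rest)
    return results


if __name__ == "__main__":
    # The token from the concordance above: word and lemma are both "1/2".
    for fields in candidate_splits("1/2/1/2", 2):
        print(fields)
```

For the token "1/2/1/2" with two attributes this gives three candidates, and nothing in the output tells the wrapper which one is right.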

2\. When using an aligned corpus, the SGML in the aligned text is escaped:

CORPUS\_SWE&amp;gt; show +corpus\_nld -lemma;
CORPUS\_SWE&amp;gt; "veranda";
&amp;lt;CONCORDANCE&amp;gt;
&amp;lt;attribute type=positional name="word" anr=0&amp;gt;
&amp;lt;LINE&amp;gt;\(...\)&amp;lt;CONTENT&amp;gt; \(...\) &amp;lt;MATCH&amp;gt;&amp;lt;TOKEN&amp;gt;veranda&amp;lt;/TOKEN&amp;gt;&amp;lt;/MATCH&amp;gt; \(...\)&amp;lt;/CONTENT&amp;gt;&amp;lt;/LINE&amp;gt;
&amp;lt;align name="saltnld\_nld"&amp;gt;&amp;amp;lt;CONTENT&amp;amp;gt; \(...\) &amp;amp;lt;TOKEN&amp;amp;gt;veranda&amp;amp;lt;/TOKEN&amp;amp;gt; \(...\) &amp;amp;lt;TOKEN&amp;amp;gt;.&amp;amp;lt;/TOKEN&amp;amp;gt; &amp;amp;lt;/CONTENT&amp;amp;gt;
\(...\)
&amp;lt;/CONCORDANCE&amp;gt;

My suggestion is of course that the aligned text should not be escaped. Also, an "&amp;lt;/align&amp;gt;" should be printed at the end.

3\. A smaller problem \(and not a bug at all\) is that the rows in the group output are contextual:

CORPUS&amp;gt; X = "de" \[\];
CORPUS&amp;gt; group X matchend lemma by match pos cut 50;
&amp;lt;TABLE&amp;gt;
&amp;lt;TR&amp;gt;&amp;lt;TD&amp;gt;DT&amp;lt;TD&amp;gt;\_\_UNDEF\_\_&amp;lt;TD&amp;gt;152&amp;lt;/TR&amp;gt;
&amp;lt;TR&amp;gt;&amp;lt;TD&amp;gt;PN&amp;lt;TD&amp;gt;vara&amp;lt;TD&amp;gt;146&amp;lt;/TR&amp;gt;
&amp;lt;TR&amp;gt;&amp;lt;TD&amp;gt;&amp;amp;nbsp;&amp;lt;TD&amp;gt;ha&amp;lt;TD&amp;gt;117&amp;lt;/TR&amp;gt;
&amp;lt;TR&amp;gt;&amp;lt;TD&amp;gt;&amp;amp;nbsp;&amp;lt;TD&amp;gt;skola&amp;lt;TD&amp;gt;100&amp;lt;/TR&amp;gt;
&amp;lt;TR&amp;gt;&amp;lt;TD&amp;gt;&amp;amp;nbsp;&amp;lt;TD&amp;gt;inte&amp;lt;TD&amp;gt;89&amp;lt;/TR&amp;gt;
&amp;lt;TR&amp;gt;&amp;lt;TD&amp;gt;&amp;amp;nbsp;&amp;lt;TD&amp;gt;komma&amp;lt;TD&amp;gt;80&amp;lt;/TR&amp;gt;
&amp;lt;TR&amp;gt;&amp;lt;TD&amp;gt;DT&amp;lt;TD&amp;gt;mången&amp;lt;TD&amp;gt;71&amp;lt;/TR&amp;gt;
&amp;lt;TR&amp;gt;&amp;lt;TD&amp;gt;&amp;amp;nbsp;&amp;lt;TD&amp;gt;där&amp;lt;TD&amp;gt;61&amp;lt;/TR&amp;gt;
&amp;lt;TR&amp;gt;&amp;lt;TD&amp;gt;&amp;amp;nbsp;&amp;lt;TD&amp;gt;andra,annan,två&amp;lt;TD&amp;gt;52&amp;lt;/TR&amp;gt;
&amp;lt;/TABLE&amp;gt;

The first column of the third row contains "&amp;amp;nbsp;", which is a way of saying "the same as above". This is okay for ASCII and HTML output, but SGML is designed for computer readability, so personally I think that it shouldn't refer to earlier rows. This is similar to "PrettyPrint off", which only works for "PrintMode ascii"...

My suggestion is that the group printer only prints &amp;amp;nbsp; if PrettyPrint is on.
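
As things stand, a wrapper has to undo the "same as above" convention itself. A minimal Python sketch of that workaround, assuming the rows have already been parsed into tuples of source, target and count; the helper name and the blank marker argument are hypothetical and not part of CWB:

```python
def fill_down(rows, blank):
    """Replace a blank first column with the last non-blank value above it."""
    filled = []
    last_source = None
    for source, target, count in rows:
        if source == blank:
            # Contextual row: reuse the source value from an earlier row.
            source = last_source
        else:
            last_source = source
        filled.append((source, target, count))
    return filled


if __name__ == "__main__":
    # "~" stands in for whatever blank marker the parser hands over.
    rows = [("DT", "__UNDEF__", 152), ("PN", "vara", 146), ("~", "ha", 117)]
    for row in fill_down(rows, "~"):
        print(row)
```

Passing the rows from the table above with the non-breaking-space entity as the blank marker would turn the third row into PN, ha, 117.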

4\. I did some digging in the source code, and it was pretty easy to make the necessary changes. \(Kudos to the programmers for making the code readable.\) Only 4 lines are affected; here's a diff:

Index: sgml-print.c
===================================================================
\--- sgml-print.c	\(revision 182\)
+++ sgml-print.c	\(working copy\)
@@ -77,7 +77,7 @@

"&amp;lt;TOKEN&amp;gt;",                    /\* BeforeToken \*/
" ",                          /\* TokenSeparator \*/
\-  "/",                          /\* AttributeSeparator \*/
\+  "&amp;lt;ATTR&amp;gt;",                     /\* AttributeSeparator \*/
"&amp;lt;/TOKEN&amp;gt;",                   /\* AfterToken \*/

"&amp;lt;CONTENT&amp;gt;",                  /\* BeforeField \*/
@@ -213,7 +213,8 @@
sgml\_puts\(stream, "&amp;lt;align name=\"", 0\);
sgml\_puts\(stream, attribute\_name, 0\);
sgml\_puts\(stream, "\"&amp;gt;", 0\);
\-  sgml\_puts\(stream, line, SUBST\_ALL\);
\+  sgml\_puts\(stream, line, 0\); 
\+  sgml\_puts\(stream, "&amp;lt;/align&amp;gt;", 0\);

fputc\('\n', stream\);
\}
@@ -431,7 +432,7 @@

source\_id = group-&amp;gt;count\_cells\[cell\].s;

\-    if \(source\_id \!= last\_source\_id\) \{
\+    if \(\!pretty\_print || \(source\_id \!= last\_source\_id\)\) \{
last\_source\_id = source\_id;
sgml\_puts\(fd, Group\_id2str\(group, source\_id, 0\), SUBST\_ALL\);
nr\_targets = 0;
</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Anonymous</dc:creator><pubDate>Wed, 15 Sep 2010 08:57:56 -0000</pubDate><guid>https://sourceforge.netb8768220a2b8edca2b2ef071840839867666543d</guid></item><item><title>patches to PrintMode SGML</title><link>https://sourceforge.net/p/cwb/patches/1/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;Ticket 1 has been modified: patches to PrintMode SGML&lt;br/&gt;
Edited By: Stefan Evert (schtepf)&lt;br/&gt;
Status updated: u'open' =&amp;gt; u'wont-fix'&lt;br/&gt;
_milestone updated: '' =&amp;gt; u'Unstable_(example)'&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Anonymous</dc:creator><pubDate>Wed, 15 Sep 2010 08:57:56 -0000</pubDate><guid>https://sourceforge.net4a49596b0eb4bcbe3c7e72d951bb6f9d629a7f5f</guid></item></channel></rss>