Discussion:
Add JSON, XML, CSV to Baby X resource compiler
Malcolm McLean
2024-05-19 10:16:27 UTC
The Baby X resource compiler takes data - fonts, images, audio, strings
- and converts them into C source so that they can be read by C programs
without relying on external data files.

An obvious extension is to take in structured data. Adding SQL and
querying a database would unfortunately mean extending the program so
that it could only run on a large machine with a SQL server running, and
isn't really a viable proposition. However JSON, XML, and CSV are
commonly used to pass small to medium amounts of data about.

I've made a start on supporting CSV with the "<dataframe>" tag. CSV data
is tabular and two dimensional, and lends itself to an array of simple
structures. JSON and XML can of course represent more complex data, with
hierarchy. The dataframe tag is still very experimental. I've never used
it for anything practical.
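
As an illustration of "an array of simple structures", here is a minimal sketch of what a three-column CSV file (name, age, height) could look like once turned into C. The type name Person, the array people and the helper mean_age are hypothetical names chosen for this example, and the rows are made-up sample data; this is not the output babyxrc actually produces.

```c
#include <stddef.h>

/* Hypothetical record type for a CSV file with the columns
   name,age,height. Illustrative only, not babyxrc's real output. */
typedef struct {
    const char *name;
    int age;
    double height;
} Person;

/* One initialiser per CSV row (sample data). */
static const Person people[] = {
    { "Alice", 34, 1.68 },
    { "Bob",   27, 1.82 },
    { "Carol", 45, 1.75 },
};

static const size_t people_count = sizeof people / sizeof people[0];

/* Example query over the embedded table: mean of the age column. */
double mean_age(void)
{
    double total = 0.0;
    for (size_t i = 0; i < people_count; i++)
        total += people[i].age;
    return people_count ? total / (double)people_count : 0.0;
}
```

Because every row has the same fields, the program can index and loop over the table directly, with no parsing at run time.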

So what would be the best approach to putting in JSON and XML support?

The project is here if you are not familiar with it. It's on GitHub.

https://github.com/MalcolmMcLean/babyxrc
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
bart
2024-05-19 22:41:36 UTC
Post by Malcolm McLean
The Baby X resource compiler takes data - fonts, images, audio, strings
- and converts them into C source so that they can be read by C programs
without relying on external data files.
An obvious extension is to take in structured data. Adding SQL and
querying a database would unfortunately mean extending the program so
that it could only run on a large machine with a SQL server running, and
isn't really a viable proposition. However JSON, XML, and CSV are
commonly used to pass small to medium amounts of data about.
I've made a start on supporting CSV with the "<dataframe>" tag. CSV data
is tabular and two dimensional, and lends itself to an array of simple
structures. JSON and XML can of course represent more complex data, with
hierarchy. The dataframe tag is still very experimental. I've never used
it for anything practical.
So what would be the best approach to putting in JSON and XML support?
I've only briefly used XML.

The problem with XML is that the data it represents is not just
hierarchical, but it can be chaotic. You can have one lot of data,
followed by another for something else with a different structure,
followed by another. It is just a container for disparate sets of data.

Even if the file does represent a simple list of records for example,
you won't know that without reading it and analysing it.

I looked online at an XML to CSV converter, which I thought would do
something clever, but it seems to just turn each XML line into one
string per line.

Maybe it doesn't matter; the user of your program knows what's in their
XML file, and will know what to do with the different bits. It's their
problem.

But you still have to figure out how to represent an arbitrary data
structure as C data. Plus you have to deal with tag names, and attributes.

Personally, I would suggest using converters (ones cleverer than the CSV
one I tried) to turn XML files into better-organised formats first.
Michael S
2024-05-20 08:46:01 UTC
On Sun, 19 May 2024 23:41:36 +0100
Post by bart
I've only briefly used XML.
The problem with XML is that the data it represents is not just
hierarchical, but it can be chaotic. You can have one lot of data,
followed by another for something else with a different structure,
followed by another. It is just a container for disparate sets of data.
Even if the file does represent a simple list of records for example,
you won't know that without reading it and analysing it.
JSON is about the same.
Mikko
2024-05-20 09:23:28 UTC
Post by bart
I've only briefly used XML.
The problem with XML is that the data it represents is not just
hierarchical, but it can be chaotic. You can have one lot of data,
followed by another for something else with a different structure,
followed by another. It is just a container for disparate sets of data.
Not just XML. JSON and many other formats have the same features.

In order to put a resource into a C program my preference is to convert
the resource to an array of characters or bytes and process it the
same way it would be processed if it were read from a file.
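
A minimal sketch of that approach, assuming a small CSV file has been embedded as a character array. The array name readings_csv, its contents and the function sum_temperatures are hypothetical and only serve to show the pattern: the program parses the embedded text at run time exactly as it would parse the contents of a file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical embedded resource: the bytes of a small CSV file,
   as a resource compiler might emit them (sample data). */
static const char readings_csv[] =
    "station,temperature\n"
    "north,12.5\n"
    "south,17.0\n";

/* Parse the embedded text as if it had been read from a file:
   skip the header line, then read one "name,value" pair per line.
   Returns the number of rows parsed and stores the sum of the
   values in *sum. */
int sum_temperatures(const char *text, double *sum)
{
    int rows = 0;
    const char *line = strchr(text, '\n');   /* end of the header line */

    *sum = 0.0;
    while (line && *++line) {                /* step past '\n', stop at end */
        char name[32];
        double value;
        if (sscanf(line, "%31[^,],%lf", name, &value) == 2) {
            *sum += value;
            rows++;
        }
        line = strchr(line, '\n');
    }
    return rows;
}
```

The cost of this design is that the parser ships with the program and runs every time, which is the trade-off under discussion in this thread.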
--
Mikko
Malcolm McLean
2024-05-20 10:58:20 UTC
Post by Mikko
Not just XML. JSON and many other formats have the same features.
In order to put a resource into a C program my preference is to convert
the resource to an array of characters or bytes and process it the
same way it would be processed if it were read from a file.
And of course the Baby X resource compiler already supports that. You
can convert XML or JSON to a string, and then run your own parser over
it at runtime.

But that isn't really a very good solution.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
Mikko
2024-05-21 08:27:46 UTC
Post by Malcolm McLean
And of course the Baby X resource compiler already supports that. You
can convert XML or JSON to a string, and then run your own parser over
it at runtime.
But that isn't really a very good solution.
Depends on the problem. If an application needs a resource that is
not needed by many other applications an ad hoc parser can produce
a result structure that a generic parser cannot do. Even then it may
be desirable to use a format that could be parsed as XML or JSON.
--
Mikko
Scott Lurndal
2024-05-20 16:14:21 UTC
Post by bart
I've only briefly used XML.
That's clear from what you write below.
Post by bart
The problem with XML is that the data it represents is not just
hierarchical, but it can be chaotic. You can have one lot of data,
followed by another for something else with a different structure,
followed by another. It is just a container for disparate sets of data.
Even if the file does represent a simple list of records for example,
you won't know that without reading it and analysing it.
Study XML Schema. https://en.wikipedia.org/wiki/XML_Schema_(W3C)

Then study XSL. https://en.wikipedia.org/wiki/XSLT

XML is a markup language. A subset of SGML. HTML is a non-proper
and non-regular subset of XML.

It's far more flexible and useful than a set of comma-separated-values.
Post by bart
I looked online at an XML to CSV converter, which I thought would do
something clever, but it seems to just turn each XML line into one
string per line.
You use stylesheets (XSL) with a stylesheet processor to make
arbitrary transformations to an XML document. The output can
be XML, HTML, CSV, or any custom format required for an application.
bart
2024-05-20 19:27:15 UTC
Post by Scott Lurndal
Post by bart
I've only briefly used XML.
That's clear from what you write below.
I have enough experience to know that it CAN represent disparate data
just like I said, since I've had to generate exactly such files as input
into another application that required such data.
Post by Scott Lurndal
Post by bart
Even if the file does represent a simple list of records for example,
you won't know that without reading it and analysing it.
Study XML Schema. https://en.wikipedia.org/wiki/XML_Schema_(W3C)
I have no interest in studying XML or ever using it again. I already
stated in an earlier post that it's more complicated than it looks.
Post by Scott Lurndal
Then study XSL. https://en.wikipedia.org/wiki/XSLT
XML is a markup language. A subset of SGML. HTML is a non-proper
and non-regular subset of XML.
It's far more flexible and useful than a set of comma-separated-values.
And, therefore, 'chaotic' in what can be represented, even if
technically it can be described by a recursively defined syntax.

All you need to know is that XML can represent the syntactic structure of
most programming languages, and we all know how easy that is to
represent as a fixed set of initialised C data structures hardcoded into
a source file.
Post by Scott Lurndal
Post by bart
I looked online at an XML to CSV converter, which I thought would do
something clever, but it seems to just turn each XML line into one
string per line.
You use stylesheets (XSL) with a stylesheet processor to make
arbitrary transformations to an XML document. The output can
be XML, HTML, CSV, or any custom format required for an application.
The OP wanted to be able to directly process XML; I suggest that it
first be transformed into something more regular.
Malcolm McLean
2024-05-21 11:59:28 UTC
Post by bart
Post by Scott Lurndal
Then study XSL.     https://en.wikipedia.org/wiki/XSLT
XML is a markup language.   A subset of SGML.  HTML is a non-proper
and non-regular subset of XML.
It's far more flexible and useful than a set of comma-separated-values.
And, therefore, 'chaotic' in what can be represented, even if
technically it can be described by a recursively defined syntax.
All you need to know is that XML can represent the syntactic structure of
most programming languages, and we all know how easy that is to
represent as a fixed set of initialised C data structures hardcoded into
a source file.
That is one answer. XML and JSON are trees, with certain constraints on
the nodes. So we could devise a C struct which represents a node, and
spit out a rooted tree.
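
A rough sketch of what such a node struct and an emitted tree could look like. The names XmlNode, book_children, book_root and count_nodes are hypothetical, the sample document is made up, and attributes are left out to keep the example short; a real design would need to represent them too.

```c
#include <stddef.h>

/* Hypothetical node type for an emitted tree. Attributes are omitted
   for brevity. */
typedef struct XmlNode {
    const char *tag;                 /* element name */
    const char *text;                /* text content, or NULL */
    const struct XmlNode *children;  /* array of child nodes, or NULL */
    size_t child_count;              /* number of entries in children */
} XmlNode;

/* Emitted data for the sample document
   <book><title>Example Title</title><pages>120</pages></book> */
static const XmlNode book_children[] = {
    { "title", "Example Title", NULL, 0 },
    { "pages", "120",           NULL, 0 },
};

static const XmlNode book_root = { "book", NULL, book_children, 2 };

/* Count the nodes in the tree rooted at node, including node itself. */
size_t count_nodes(const XmlNode *node)
{
    size_t total = 1;
    for (size_t i = 0; i < node->child_count; i++)
        total += count_nodes(&node->children[i]);
    return total;
}
```

Everything here is constant initialised data, so the tree needs no allocation or parsing at run time, but the caller still has to walk it and interpret the strings.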
Post by bart
Post by Scott Lurndal
Post by bart
I looked online at an XML to CSV converter, which I thought would do
something clever, but it seems to just turn each XML line into one
string per line.
You use stylesheets (XSL) with a stylesheet processor to make
arbitrary transformations to an XML document.  The output can
be XML, HTML, CSV, or any custom format required for an application.
The OP wanted to be able to directly process XML; I suggest that it
first be transformed into something more regular.
That seems to be a better approach. Data represents something in the
real world. And normally there are a lot of constraints on it. A
temperature measurement might be missing, but it must be real, it can't
be imaginary. Even though XML will of course allow you to put "<REAL>"
and "<IMAGINARY>" tags under the temperature tag if you so desire.

But how to do that whilst keeping the program usable?
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
Ben Bacarisse
2024-05-22 11:43:43 UTC
Post by bart
The OP wanted to be able to directly process XML; I suggest that it first
be transformed into something more regular.
That seems to be a better approach. Data represents something in the real
world. And normally there are a lot of constraints on it. A temperature
measurement might be missing, but it must be real, it can't be
imaginary. Even though XML will of course allow you to put "<REAL>" and
"<IMAGINARY>" tags under the temperature tag if you so desire.
It's already been pointed out, but the reason people use XML is that XML
files can be validated against a schema that can prevent exactly this kind
of error.
--
Ben.
Malcolm McLean
2024-05-25 17:18:05 UTC
Post by bart
Post by Scott Lurndal
Post by bart
Post by Malcolm McLean
The Baby X resource compiler takes data - fonts, images, audio, strings
- and converts them into C source so that they can be read by C programs
without relying on external data files.
I looked online at an XML to CSV converter, which I thought would do
something clever, but it seems to just turn each XML line into one
string per line.
You use stylesheets (XSL) with a stylesheet processor to make
arbitrary transformations to an XML document.  The output can
be XML, HTML, CSV, or any custom format required for an application.
The OP wanted to be able to directly process XML; I suggest that it
first be transformed into something more regular.
I've made an attempt.

The approach is to convert JSON and CSV to XML, and then handle the data
with one unified function.
So the user has three options:

/* This one just makes a best guess for what you want */
<dataframe src="mydata.xml" />

/* This one tells the program that the records are in nodes with the tag
"record" */
<dataframe src="mydata.xml" xpath="//record" />

/* This one declares your own structure, picking fields from the xml
document */
<dataframe src="mydata.xml">
<field name="title" />
<field name="author" xpath="books/authors/***@name" />
<field name="copyright">
<field name="year" xpath="//copyright/year" />
<field name="owner" xpath="books/authors/***@name" />
</field>
</dataframe>

It's basically working, but it needs quite a bit more work to allow the
user tighter control of types.
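
To make the third form concrete, here is a hedged sketch of the kind of nested C structure that the field declarations above might map onto. The names BookRecord, books and earliest_year, the choice of int for the year, and the sample rows are all assumptions made for illustration; the code on the topic/dataimport branch may generate something quite different.

```c
#include <stddef.h>

/* Hypothetical layout for the "title", "author" and nested
   "copyright" fields. Not babyxrc's actual output. */
typedef struct {
    const char *title;
    const char *author;
    struct {
        int year;            /* assumed numeric; could equally be a string */
        const char *owner;
    } copyright;
} BookRecord;

/* Sample rows, one per record selected from the XML document. */
static const BookRecord books[] = {
    { "First Example",  "A. Author", { 2001, "A. Author" } },
    { "Second Example", "B. Writer", { 2015, "Some Press" } },
};

static const size_t books_count = sizeof books / sizeof books[0];

/* Return the earliest copyright year in the table, or 0 if it is empty. */
int earliest_year(void)
{
    int best = 0;
    for (size_t i = 0; i < books_count; i++)
        if (i == 0 || books[i].copyright.year < best)
            best = books[i].copyright.year;
    return best;
}
```

Whether a field such as year becomes an int or a string is exactly the kind of type control the tag would need to expose.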

To explain, a "dataframe" is a two-dimensional table of values, which can
be numbers or non-numerical values. So most statistics are done with
dataframes.

But of course XML can also represent trees, which are a different ball
game. That would be a different tag.

If anyone wants to have a play, or dive in with suggestions, it's all on
github. (https://github.com/MalcolmMcLean/babyxrc the changes are on the
branch topic/dataimport).
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm