The List of PhD Proposals - CISUC

CISUC has nine research groups:

G1: Computer Science

G2: Artificial Intelligence: Foundations and Applications

G3: Simulation and Information Technologies in Education and Training

G4: Adaptive Computation

G5: Dependable Systems

G6: Communications and Telematics

G7: Databases

G8: Information Systems

G9: Evolutionary and Complex Systems

The list of PhD Proposals is divided by group.

Each proposal has a reference Gx.y (x is the number of the group; y is the number of

the proposal within that group). Use these references to fill in the Scholarship

Application Form.

CISUC will offer 9 scholarships in the upcoming year. Students who do not have any

other sources of funding can apply for these scholarships. The scholarship includes the

payment of the tuition fees and a monthly payment of 980€. Students should be aware

that this is a scholarship only for the first year. Financial support for the remaining

years of the PhD is the responsibility of the Supervisor. It is advisable that

students first discuss with their future Supervisor the possibilities for

financial support for the rest of the PhD program.

Candidates should choose 3 Thesis Proposals, listed by order of preference. The chosen

proposals can be from the same or different groups.

Each group of CISUC has one scholarship to offer. The best candidate in each group

will get the scholarship. The selection of the candidates that will get a scholarship is not

a global process, but rather a distributed process with nine independent queues of

selection. Each group is responsible for the selection of the best candidate, and the final

list will be approved by the Scientific Commission of CISUC.

In the following pages we present a list of proposed PhD Theses for the next year

(2007-2009). Candidates for the scholarship should read this list carefully and choose the

3 proposals that best fit their interests and research background.

Deadline to apply for a scholarship: 21st of July, 2006

G1: Computer Science

PhD Thesis Proposal: G1.1

Title: “Human readable ATP proofs in Euclidean Geometry”

Keywords: Automatic Theorem Proving, Axiomatic Proofs in Euclidean Geometry.

Supervisor: Prof. Pedro Quaresma (


Automated theorem proving (ATP) in geometry has two major lines of research:

axiomatic proof style and algebraic proof style (see [6], for instance, for a survey).

Algebraic proof style methods are based on reducing geometric properties to algebraic

properties expressed in terms of Cartesian coordinates. These methods are usually very

efficient, but the proofs they produce do not reflect the geometric nature of the problem

and they give only a yes/no conclusion. Axiomatic methods attempt to automate

traditional geometry proof methods that produce human-readable proofs. Building on

top of the existing ATPs (namely GCLCprover [5, 4, 8, 9, 10], based on the area method [1, 2, 3,

7, 8, 11], or ATPs dealing with constructions [6]), the goal is to build an ATP capable of

producing human-readable proofs, with a clean connection between the geometric

conjectures and their proofs.


PhD Thesis Proposal: G1.2

Title: “Formal languages in the knowledge base management”

Keywords: Formal language, knowledge base, knowledge base inconsistency,

knowledge base management.

Supervisor: Prof. Maria de Fátima Gonçalves (


Knowledge base management is understood as the acquisition and normalization of new

knowledge and the confrontation of this knowledge with existing knowledge, resolving

potential conflicts and updating it. When updating a knowledge base, several problems

may arise. Some of them are the redundancy of the updated knowledge and the

knowledge base inconsistency. Formal languages can be used successfully in the

development of an innovative system for knowledge base management, resolving

potential updating problems.

PhD Thesis Proposal: G1.3

Title: “Similarity Measures and Applications”

Keywords: Kolmogorov complexity, normalized distances, machine learning,

clustering, universal similarity metric.

Supervisor: Prof. Ana Maria de Almeida (


A recurring problem in many strategies that rely on knowledge acquisition (learning) in

order to act is the identification of "similarities". How can we measure "similarity" in

order to determine, for example, the evolutionary distance between two sequences such

as 2 documents on the Internet, or 2 computer programs, or two musical scores, or 2

genomes, or 2 lines


This project aims to study the emerging area of similarity distance measures, which are

useful in data mining, pattern recognition, learning and automatic semantic extraction.

Vitányi et al. created the "normalized information distance", based on the notion of

Algorithmic Information Complexity (commonly known as Kolmogorov complexity), and

showed that it is a universal measure for the class in question, since it discovers all

effective similarities. They also showed that it is a metric and that it takes values in

[0,1]. There are already several practical examples with interesting final results:

molecular similarities, identification of composers from musical scores, fetal

echocardiograms, among others. The theory is general and, as such, applicable to diverse

areas or collections of objects. The aim is to investigate its practical suitability for

particular cases such as learning algorithms (kernel function) and clustering.
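
Kolmogorov complexity is not computable, so practical work on the normalized information distance usually approximates it with a real compressor. The sketch below is not part of the proposal: it illustrates one such approximation, the normalized compression distance (NCD), assuming that zlib is an acceptable stand-in compressor. The function names are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (a rough proxy for complexity)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C is the compressed size. Values near 0 suggest similar inputs,
    values near 1 suggest unrelated inputs.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Because a real compressor is imperfect, the value for two identical inputs is small but not exactly 0, and the value for unrelated inputs can slightly exceed 1. A pairwise NCD matrix computed this way could then be fed to a clustering algorithm, which is one of the particular cases the proposal mentions.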

PhD Thesis Proposal: G1.4

Title: “(Kolmogorov) Entropy in the Definition of Rate-Distortion (R-D) Models

for Video Signal Coding”

Keywords: Kolmogorov complexity, Entropy, Rate-Distortion Theory.

Supervisor: Prof. Ana Maria de Almeida (


Over the last decade, Algorithmic Information Complexity (AIC, also known as Kolmogorov

complexity) has established itself as the theoretical basis for a range of applications in

the most diverse areas, allowing the (re)definition of models and concepts. In particular,

owing to an interesting strong link between the statistical notion of Shannon's entropy

measure and the deterministic AIC measure, we have seen not only several proofs of

this theoretical link, but also studies and models showing that AIC can replace entropy,

for example (and only as an example) in cryptography, with the advantage that it is no

longer necessary to know probability distributions that, in truth, are not really known

so far.

Since a theoretical model, parallel to Shannon's study, is already known for replacing

entropy in the evaluation of Rate-Distortion (R-D) theory, on which signal transmission

is based, this project aims to investigate the possible practical application of this new

model to the optimization of R-D functions in video signal coding.
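
For reference, the classical (Shannon) rate-distortion function that the proposal refers to is usually defined as below. The notation ($X$ for the source, $\hat{X}$ for its reconstruction, $d$ for the distortion measure, $I$ for mutual information) is the standard textbook one and is an assumption here, since the proposal itself does not fix any notation.

```latex
R(D) = \min_{p(\hat{x} \mid x)\,:\; \mathbb{E}[d(X,\hat{X})] \le D} I(X;\hat{X})
```

The Kolmogorov-complexity-based model mentioned in the proposal would replace the probabilistic quantities in this definition; its exact form is not given in the source.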

G2: Artificial Intelligence: Foundations and Applications


PhD Thesis Proposal: G2.1

Title: “Computational Aesthetics”

Keywords: Evolutionary Art, Artificial Artists, Computational Aesthetics

Supervisor: Prof. Penousal Machado (


The use of biologically inspired techniques for image generation tasks is a recent, exciting

and significant area of research. There is a growing interest in the application of these

techniques in fields such as: visual art and music generation, analysis, and

interpretation; sound synthesis; architecture; video; design; etc. In most cases, these

systems resort to a human guided evolutionary algorithm: the user provides fitness

scores for the images, thus steering evolution towards images that match his/her

aesthetic preferences.

Although these systems are interesting in their own right as computer aided creativity

tools, they lack autonomy and the capacity to make their own aesthetic judgements;

consequently, they cannot be considered artificial artists. According to the framework

proposed by Machado et al. (2003), an artificial artist can be seen as a system composed

of two modules: a creator and a critic.

The stunning imagery created with interactive evolutionary art tools indicates that an

evolutionary computation approach is suitable for the creator role. As such, the

bottleneck lies in the development of adequate artificial critics.

In Machado et al. (2004) the authors propose a generic framework for the development

of artificial art critics composed of two main modules: a feature extractor, which is

responsible for the “perception” of the artwork, collecting a set of low-level, aesthetically

relevant features; and an evaluator, which performs an assessment of the aesthetic merits of

the artwork based on the output of the feature extractor module.

The proposed approach was tested in several author and style identification tasks

achieving high success rates (>95%), which shows the feasibility of the approach.

Following the previous work of Machado et al. (e.g. 2003, 2004) the present thesis will

focus on the development of artificial art critics and on their integration with existing

interactive art tools. The key issues to be addressed are the further development of the

current feature extractor, the analysis of the relevance of the incorporated features, and

the online training of the evaluator module.


PhD Thesis Proposal: G2.2

Title: “Stylistic Based Image Retrieval and Classification”

Keywords: Content Based Image Retrieval, Artificial Art Critics, Computational


Supervisor: Prof. Penousal Machado (


The days of the text-based World Wide Web are over. Today the web is dominated by

multimedia content. Nevertheless, the most popular search engines are text-based.

Although a few search engines for finding images and video are available, these engines

are still based on text. The large commercial image providers use human indexers to

select keywords and to classify each of the images. Google and other popular search

engines allowing image or video search base the retrieval on the textual content of the

page or on the keywords inserted by a human user.

Content based image retrieval is a complex task. Taken to the limit it is a generic vision

problem, requiring object recognition, image understanding, concept formation, object

classification, content analysis, etc. Computer vision is probably one of the hardest

problems in Artificial Intelligence and one that has been baffling computer scientists

during the last decades. Although significant progress has been made, it is not likely to

be solved in the near future. This poses limits on what content-based image retrieval

can accomplish; however it also creates opportunities.

In Machado et al. (2004) the authors presented a system composed of two main

modules: a feature extractor, which is responsible for the “perception” of the image; and an

evaluator, which performs an assessment of the image based on the output of the

feature extractor module. The feature extractor is composed of a set of low-level

features that proved to be relevant to the stylistic classification of images. The proposed

approach was tested in several author and style identification tasks achieving high

success rates (>95%), which shows the feasibility of the approach and its potential for

the development of stylistic based image retrieval engines.

The main objective of this thesis is the further development of the current feature

extractor, focusing on the incorporation of features proved to be relevant for content

based image retrieval tasks. Unlike other content based image retrieval systems, which

focus on the identification of images containing a given set of objects, we are primarily

interested in the retrieval of stylistically similar images. Therefore, particular emphasis

will be given to features that are able to capture stylistic characteristics. Additionally,

the development of a fully working prototype and its empirical testing are key aspects of

this thesis.


PhD Thesis Proposal: G2.3

Title: “Image Representation for Evolutionary Computation”

Keywords: Evolutionary Art, Programmatic Compression

Supervisor: Prof. Penousal Machado (


The use of evolutionary computation techniques for image generation tasks is a recent,

exciting and significant area of research. Following the footsteps of Karl Sims, most

systems resort to the evolution of symbolic expressions. Once evaluated, these

expressions result in images. The tree-like nature of the expressions allows their

meaningful manipulation through conventional Genetic Programming operators, and

the results attained by several researchers during the past years show that this is a

powerful image generation method. Although expression based representations prove to

be adequate in the context of an evolutionary approach, they suffer from a major

problem. The generation of an image from a given expression is straightforward;

however, the inverse problem – finding the symbolic expression for a given image – is

(NP) hard. The goals of this thesis are twofold: 1) explore alternative image

representation schemes, e.g. line based representations, that are adequate for

evolutionary computation but that do not suffer from the aforementioned problem; 2)

following the work of McGuire, Nordin, and others, explore the feasibility of using

programmatic image compression methods in order to find compact symbolic

expressions for a given image.
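
To make the expression-based representation concrete, here is a minimal sketch of how a symbolic expression can be evaluated into an image. It is an illustration only: the nested-tuple encoding, the small function set in `FUNCS`, and the names `evaluate` and `render` are assumptions, not the representation used by any of the systems cited above.

```python
import math

# Hypothetical function set; real systems typically use a much richer one.
FUNCS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "sin": math.sin,
}


def evaluate(expr, x, y):
    """Evaluate an expression tree at the point (x, y).

    An expression is the variable name "x" or "y", a numeric constant,
    or a tuple (function_name, argument, ...).
    """
    if expr == "x":
        return x
    if expr == "y":
        return y
    if isinstance(expr, (int, float)):
        return float(expr)
    op, *args = expr
    return FUNCS[op](*(evaluate(arg, x, y) for arg in args))


def render(expr, width, height):
    """Return a height x width grid of grey levels in 0..255.

    Pixel coordinates are mapped to [-1, 1] and the expression value is
    clamped to [-1, 1] before scaling. Requires width and height >= 2.
    """
    image = []
    for row in range(height):
        y = 2 * row / (height - 1) - 1
        line = []
        for col in range(width):
            x = 2 * col / (width - 1) - 1
            value = max(-1.0, min(1.0, evaluate(expr, x, y)))
            line.append(round((value + 1) / 2 * 255))
        image.append(line)
    return image
```

For instance, `render(("sin", ("mul", "x", "y")), 64, 64)` produces a 64 x 64 grid from a three-node tree. Going from tree to image is a single pass over the pixels; the inverse direction, which the second goal of the thesis targets, has no comparably direct procedure.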

PhD Thesis Proposal: G2.4

Title: “Evolution of Dynamic Reactive Artworks”

Keywords: Evolutionary Art, Fitness Automation

Supervisor: Prof. Penousal Machado (


The use of evolutionary computation techniques for image generation tasks is a recent,

exciting and significant area of research. During the past few years we watched the

emergence of several evolutionary art tools. Most of these tools are interactive, in the

sense that the user guides the evolutionary process, thus steering evolution towards

images that match his/her aesthetic preferences. However, the resulting artworks are

static images and not interactive installations. The typical evolutionary art tool evolves

programs; once executed, these programs result in images. The programs take no input;

as a result their output is static. The objective of this thesis is the development of an

evolutionary art tool able to produce dynamic and reactive artworks. To achieve this

task, the programs being evolved receive an input signal, e.g. a hand gesture, and their

output depends on the input. Therefore different “gestures” will yield different images,

which results in a dynamic reactive artwork. The programs being evolved can be seen

as mappings between the input signal and images. The difficulty lies in the

development of interesting mappings. Given the time-based nature of the domain,

resorting to human guided evolution is not a viable option. Instead we wish to explore

ways of automating fitness assignment, so that uninteresting mappings can be

eliminated beforehand. Additionally, and although “gestures” are a natural choice for

input signal, others exist and should be explored in this thesis, allowing the evolution of

artworks that react to sound, lighting, bio-signals, etc.

PhD Thesis Proposal: G2.5

Title: “An Affect-based Multi-Agent System”

Keywords: Affect, emotion, autonomous agents, multi-agent systems

Supervisor: Prof. Luís Macedo (


Emotion and motivation (merged in the broader term affect) are essential for survival,

well-being and communication in humans by, among other functions, playing a central

role in cognitive activities such as decision-making, planning and creativity. So, the

question is: why don’t artificial agents take advantage of emotions and motivations as

humans do? What can emotional artificial agents do better than those that are not

based on emotion? What can emotion and motivation offer to artificial agents?

Certainly, not all advantages that humans benefit from are applicable to artificial agents.

But we can think of a series of situations in which we can see the emotional

advantage such as in text-to-speech systems by giving more intonation to speech,

entertainment, preventive medicine, helping autistic people, in artificial pets,

personalized agents that can act on the behalf of someone by selecting news, music,

etc., according to someone’s mood, consumer feedback by measuring the emotions of

consumers when dealing with a specific product, etc. Such applications require the

abilities to recognize, express and experience emotions. Research in Artificial

Intelligence has almost ignored this significant role of emotions in reasoning, and only

recently has this issue been taken seriously, mainly because of the recent advances in

neuroscience, which have given evidence that cognitive tasks of humans, and

particularly planning and decision-making, are influenced by emotion.

The research question/thesis statement is that those tasks mentioned above can be

robustly performed by affective agents. The approach comprises the development of a

multi-agent system comprising affective agents [Macedo & Cardoso, 2004] so that this

framework can be used to build agent-based applications.


PhD Thesis Proposal: G2.6

Title: “Collaborative Multi-Agent Exploration of 3-D Dynamic Environments”

Keywords: Collaborative Multi-Agent Exploration of 3-D Dynamic Environments

Supervisor: Prof. Luís Macedo (


Exploration gathers information about the unknown. Exploration of unknown

environments by artificial agents (usually mobile robots) has actually been an active

research field [Macedo & Cardoso, 2004]. The exploration domains include planetary

exploration (e.g., Mars or lunar exploration), search for meteorites in Antarctica, volcano

exploration, map-building of interiors, etc. Several exploration techniques have been

proposed and tested in both simulated and real, indoor and outdoor environments,

using single or multiple agents. The main advantage of multi-agent approaches is to

avoid covering the same area by two or more agents. However, there is still much to be

done, especially in dynamic environments such as those mentioned above. Besides, real

environments consist of objects. For example, office environments possess

chairs, doors, garbage cans, etc., cities comprise several kinds of buildings (houses,

offices, hospitals, churches, etc.), cars, etc. Many of these objects are non-stationary,

that is, their locations may change over time. This observation motivates research on a

new generation of mapping algorithms, which represent environments as collections of

objects. At a minimum, such object models would enable a robot to track changes in

the environment. For example, a cleaning robot entering an office at night might realize

that a garbage can has moved from one location to another. It might do so without the

need to learn a model of this garbage can from scratch, as would be necessary with

existing robot mapping techniques. This thesis addresses the problem of finding

multi-agent strategies for the collaborative exploration of unknown, 3-D,

dynamic environments. The strategy or strategies should be tested against other

exploration strategies found in the literature.


PhD Thesis Proposal: G2.7

Title: “Case-Based Hierarchical-Task Network Planning”

Keywords: HTN Planning, Case-based Planning, Decision-theoretic planning

Supervisor: Prof. Luís Macedo (


Hierarchical-Task Network (HTN) planning is a planning methodology that is more

expressive than STRIPS-style planning. Given a set of tasks that need to be performed

(the planning problem), the planning process decomposes them into simpler subtasks

until primitive tasks or actions that can be directly executed are reached. Methods

provided by the domain theory indicate how tasks are decomposed into subtasks.

However, for many real-world domains, it is hard to collect methods to

completely model the generation of plans. For this reason an alternative approach that

is based on cases of methods has been taken in combination with methods. Real-world

domains are usually dynamic and uncertain. In these domains actions may have

several outcomes, some of which may be more valuable than others. Planning in these

domains requires special techniques for dealing with uncertainty. Actually, this has been

one of the main concerns of planning research in recent years, and several

decision-theoretic planning approaches have been proposed and used successfully, some

based on the extension of classical planning and others on Markov Decision Processes.

In these decision-theoretic planning frameworks actions are usually probabilistic

conditional actions, preferences over the outcomes of the actions are expressed in terms

of a utility function, and plans are evaluated in terms of their expected utility. The

main goal is to find the plan or set of plans that maximizes an expected utility function,

i.e., to find the optimal plan. In this thesis a planner that combines the technique of

decision-theoretic planning with the methodology of HTN planning should be built in

order to deal with uncertain, dynamic large-scale real-world domains [Macedo &

Cardoso, 2004]. Unlike in regular HTN planning, methods for task decomposition

should not be used; cases of plans should be used instead. The planner should generate a variant of

an HTN - a kind of AND/OR tree of probabilistic conditional tasks - that expresses all the

possible ways to decompose an initial task network.
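
The two ingredients described above, task decomposition and expected-utility evaluation, can be sketched as follows. This is a toy illustration under strong assumptions, not the proposed planner: it uses ordinary methods rather than cases of plans, the domain in `METHODS` and `OUTCOMES` is invented, and the utility of a plan is assumed to be the sum of the expected utilities of its independent actions.

```python
from itertools import product

# Hypothetical domain: each compound task maps to alternative decompositions.
METHODS = {
    "travel": [["walk"], ["call_taxi", "ride_taxi"]],
}

# Hypothetical probabilistic outcomes per primitive action: (probability, utility).
OUTCOMES = {
    "walk": [(1.0, 2.0)],
    "call_taxi": [(0.9, 0.0), (0.1, -5.0)],
    "ride_taxi": [(1.0, 6.0)],
}


def decompositions(task):
    """Yield every sequence of primitive actions that accomplishes the task.

    Alternative methods act as OR branches; the subtasks inside one method
    act as AND branches that must all be decomposed.
    """
    if task not in METHODS:  # primitive task: directly executable
        yield [task]
        return
    for subtasks in METHODS[task]:
        options = (list(decompositions(sub)) for sub in subtasks)
        for combo in product(*options):
            yield [action for part in combo for action in part]


def expected_utility(plan):
    """Sum of probability-weighted utilities over all actions in the plan."""
    return sum(p * u for action in plan for p, u in OUTCOMES[action])


def best_plan(task):
    """Return the decomposition with the highest expected utility."""
    return max(decompositions(task), key=expected_utility)
```

Enumerating every decomposition is only feasible for tiny domains; it is shown here to make the AND/OR structure visible, not as a strategy for the large-scale domains the proposal targets.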


PhD Thesis Proposal: G2.8

Title: “Ontology Learning from Text in Portuguese”

Keywords: Ontology Learning, Information Extraction, Natural Language Processing

Supervisor: Prof. Paulo Gomes (


Much of today’s knowledge is gathered in a textual format (e.g. imagine the amount of

knowledge available on the web); as such, the extraction and mining of knowledge from

texts becomes an important dimension of current research. One way of representing

knowledge is through the use of an ontology. Simply put, an ontology is a shared

understanding of some domain of interest where hidden connections between concepts

are made explicit. The use of such a structure allows communities and computer

systems to share a consistent understanding of what information means; in other words

its semantics.

The main objective of this thesis is the development of a set of methodologies that allow

the extraction of important concepts (relative to a certain domain) from text along with

the implicit and explicit relations that hold between them. Another aspect the thesis

should address pertains to the evaluation of the knowledge extracted and its

applicability in information systems.

PhD Thesis Proposal: G2.9

Title: “Intelligent Knowledge Management using the Semantic Web”

Keywords: Semantic Web, Knowledge Management, Artificial Intelligence, Ontologies

Supervisor: Prof. Paulo Gomes (


Nowadays, companies gather and store large amounts of information in databases. This

information represents potentially high-value knowledge for a company. But most of this

information or data is not transformed into knowledge, remaining lost in databases or

document repositories. Software development is a knowledge intensive activity involving

several types of know-how and skills. Usually development teams have several

members, which makes sharing and dissemination of knowledge crucial for project

success. One evolving technology that can be used with the purpose of building

knowledge management tools for the software development area is the semantic web.

Semantics is the missing link between information/data and knowledge, and the

semantic web provides the infrastructure needed for making a true sharing of

knowledge possible.

The semantic web is an infrastructure providing semantics associated with words in

web resources. But by itself it does not provide a tool for knowledge management. What

is needed are tools that enable the usage of the semantic web in an intelligent way, so

that users can take advantage of knowledge sharing. The main problem to be dealt with

in this thesis is how a team of software development engineers can be aided by a tool, or

a set of tools, that enables them to reuse knowledge in a more efficient way, thus

increasing their productivity.

The main objective of this thesis is to develop a set of tools based on the semantic web.

These tools are intended to have a set of intelligent characteristics, such as: learning,

proactive reasoning, semantic searching and retrieval of knowledge, representation of

knowledge, knowledge acquisition, personalization, and others. Several reasoning

methods have been developed in Artificial Intelligence and are ideal candidates to be

used in this research work. The expected results of this research work include new

algorithms and methodologies for knowledge management.

PhD Thesis Proposal: G2.10

Title: “Automatic Document Indexation in Portuguese”

Keywords: Document Indexing, Natural Language Processing, Semantic Web

Supervisor: Prof. Paulo Gomes (


The main goal of this thesis proposal is to develop a system capable of indexing

documents in an ontology. The application area for this thesis is the domain of software

development. The main research contribution of this thesis is the development of

classification and indexation methodologies for documents related to software

engineering. Target documents are written in Portuguese and are related to software

development (manuals, reports, papers ...). The approach to be followed is based on a

repository of knowledge objects, which is structured by an ontology. The main

infrastructure for the storage and indexing of these objects is based on the languages

developed for the Semantic Web. New classification algorithms will also have to be

developed, so that they can cope with this document diversity. Another major challenge

of this thesis is the correct disambiguation of document topics. This work is to be

integrated into ReBuilder, a software tool for reuse of UML diagrams.


PhD Thesis Proposal: G2.11

Title: “Automatic Named Entity Recognition”

Keywords: Named Entity Recognition, Natural Language Processing, Case-Based Reasoning


Supervisor: Prof. Paulo Gomes (


Automatic named entity recognition is the identification and classification of linguistic

expressions that refer to a specific entity. For example, “Coimbra University” is a named

entity, which comprises a sequence of words referring to the University of Coimbra.

These entities (represented by a sequence of words) possess their own linguistic

properties. Natural language processing systems and other applications dealing with

natural language text must be able to identify and classify these entities, in order to use

the associated semantics, which is very different from using the words individually.

There are several international contests that compare and evaluate systems for named

entity identification, for example MUC (Message Understanding Conference), later the

ACE (Automatic Content Extraction). More recently came the first contest for Portuguese,

HAREM. The goal of this proposal is to

develop a named entity recognition system for the Portuguese language, and participate

in the HAREM contest. A Case-Based Reasoning (CBR) approach is suggested. CBR can

be defined as a way of reasoning based on past experiences, and from our point of

view it can be applied successfully to this problem. CBR also enables the integration

of other approaches making a good framework for solving this problem.

PhD Thesis Proposal: G2.12

Title: “Converting Text into UML Diagrams”

Keywords: Natural Language Processing, Case-Based Reasoning, Software Reuse,


Supervisor: Prof. Paulo Gomes (


Language is the most common form of communication between humans, both written

and spoken. Software developers are no exception, with natural language text being an

important part of the software specification documents. In the last decade, software

modeling languages, such as UML, have been developed and used in the specification of

software systems. The main idea of this proposal is to develop an approach for the

conversion of natural language software specifications into UML diagrams, both use

cases and class diagrams. This work is to be integrated into ReBuilder,

a software tool for reuse of UML diagrams.

PhD Thesis Proposal: G2.13

Title: “Reusing Software Design Patterns”

Keywords: Software Design Patterns, UML, Case-Based Reasoning, Software Design


Supervisor: Prof. Paulo Gomes (


Software engineers and programmers deal with repeated problems and situations in the

course of software design. Software design patterns were developed to deal with this

type of situation, where the same abstract solution is used several times. Software

design patterns can be defined as a description of an abstract solution for a category of

design problems. One of the main advantages of patterns is design reusability. Another

main advantage is that the application of design patterns improves and makes software

maintenance easier – design for change. Software design patterns are described in

natural language and lack a formalization. This is due to the abstract level of the

design patterns, which makes the application of design patterns a human-dependent task.


Existing approaches to pattern application using computer tools need the help and

guidance of a human designer. This is especially true in the selection of the design

pattern to apply. It is difficult to automate the identification of the context in which a

pattern can be applied. Human designers must also identify which are the objects

involved in the pattern application. The automation of this task opens the possibility for

CASE design tools to provide complete automation in applying design patterns. This

new functionality can help the software designer to improve systems and achieve better

software reuse. Case-Based Reasoning can be defined as a way of reasoning based on

past experiences. From our point of view CBR can be applied successfully to the

automation of software design patterns. The aim of this thesis proposal is to develop an

approach that addresses this problem. The proposed CBR framework must be able to

select which pattern to apply to a target design problem generating a new design. It can

also learn new cases from the application of design patterns. This work is to be

integrated into ReBuilder, a software tool for reuse of UML diagrams.


PhD Thesis Proposal: G2.14

Title: “Adaptation and Reuse Mechanisms for UML Diagrams”

Keywords: Case-Based Reasoning, Ontologies, Software Reuse, UML

Supervisor: Prof. Paulo Gomes (


Management and reuse of UML diagrams is an important aspect for any software

development company. The productivity increase that can be obtained from an effective

reuse of software development knowledge is crucial for market survival. This proposal

intends to develop new mechanisms for adaptation and reuse of UML diagrams. This

work is to be integrated into ReBuilder, a software tool for reuse of

UML diagrams. ReBuilder will work as a development platform, in which the developed

approaches will be integrated. This enables easy testing and experimentation of the

developed approaches.

PhD Thesis Proposal: G2.15

Title: “Algorithms for Semantic Annotation of Positioning Information”

Keywords: locations, places, positioning systems, location based services

Supervisors: Prof. Francisco Câmara (

Prof. Carlos Bento (


Although we find today a myriad of positioning technologies (from the “common” GPS to

Wireless, GSM cell or Ultra Wide Band positioning algorithms), the interpretation of

what exactly a position means is still cumbersome. For example, the information that “we

are at latitude 30.123N and longitude 4.234W” or “my current GSM cell ID is 1098” is

poor in terms of meaning for a user. Information such as “I am in Morocco”, “my

current location is in Coimbra” or “I am at work” is clearly richer and more useful for a

wealth of applications. This is known as the “From Position to Place” problem

(Hightower, 2003) and is currently a hot topic in the Ubiquitous Computing area. The

primary goal of this PhD project is to study and develop methodologies that can

contribute to solving the problem just described. The expected approach will likely take

into account the user model, context and social interaction. This work is one of the

central topics of research of the Ubiquitous Systems Group of the AILab and has a high

potential of applicability in a range of state-of-the-art ubiquitous systems.

PhD Thesis Proposal: G2.16

Title: “Improvement of Algorithms for Indoor Location Supported on GSM and WiFi


Keywords: indoor location, GSM and WiFi signatures, location based services,

ubiquitous computing

Supervisors: Prof. Francisco Câmara (

Prof. Carlos Bento (


Information on location is a main concern for ubiquitous computing. Many applications

of ubiquitous systems depend on location to achieve their goal. Although the problem of

positioning a device or person outdoors is reasonably solved with GPS technologies, the

problem of indoor location is much more challenging. Various technologies exist for

indoor location, but in general they need a dedicated infrastructure. Some researchers

have followed a promising approach that consists of using the information provided by GSM

and WiFi equipment on the level of the signals received from the base stations

(signature) to infer the current location of the equipment. In our group we have a

consolidated line of research in this direction, using GSM and WiFi signatures to infer the current

location with case-based reasoning techniques. This thesis focuses on

improving these algorithms, in terms of precision and accuracy, following concurrent

inference approaches, not necessarily restricted to case-based reasoning.
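
As a rough illustration of signature-based location inference, the sketch below retrieves the most similar stored signatures and returns the majority place label, which is one simple way to read the "retrieve" step of case-based reasoning. The case base, the station identifiers, the `MISSING_DBM` floor value and the function names are all hypothetical and are not taken from the group's actual system.

```python
import math
from collections import Counter

# Hypothetical case base: each case pairs a signal signature
# (station id -> received signal strength in dBm) with a known place label.
CASE_BASE = [
    ({"gsm:1098": -60, "wifi:ap1": -40, "wifi:ap2": -75}, "office"),
    ({"gsm:1098": -62, "wifi:ap1": -45, "wifi:ap2": -70}, "office"),
    ({"gsm:1098": -70, "wifi:ap1": -80, "wifi:ap2": -42}, "lab"),
    ({"gsm:1098": -72, "wifi:ap1": -78, "wifi:ap2": -45}, "lab"),
    ({"gsm:2044": -55, "wifi:ap3": -50}, "cafeteria"),
]

MISSING_DBM = -100  # assumed floor value for a station that is not heard


def signature_distance(a, b):
    """Euclidean distance over the union of stations seen in either signature."""
    stations = set(a) | set(b)
    return math.sqrt(sum(
        (a.get(s, MISSING_DBM) - b.get(s, MISSING_DBM)) ** 2 for s in stations
    ))


def infer_location(signature, case_base=CASE_BASE, k=3):
    """Retrieve the k most similar stored cases and return the majority label."""
    nearest = sorted(
        case_base, key=lambda case: signature_distance(signature, case[0])
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

A call such as `infer_location({"gsm:1098": -61, "wifi:ap1": -42, "wifi:ap2": -73})` would compare the observed signature against every stored case. Alternative inference approaches could replace the distance function or the voting step without changing the overall structure.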

PhD Thesis Proposal: G2.17

Title: “Context Modelling from Data on Ubiquitous Computational Devices”

Keywords: context awareness, context models, inference of context models,

proactivity, ubiquitous devices

Supervisors: Prof. Francisco Câmara (

Prof. Carlos Bento (


The growing availability of a wide range of sensors and communication services (such

as cameras, GPS, GSM or wireless) in devices such as PDAs or mobile phones is opening a

new world of possibilities in Ubiquitous Computing. The relatively new area of Context

Awareness is dedicated to modelling this information as well as to the design of possible

applications. This PhD project is expected to explore the predictability associated

with the use of ubiquitous devices. In other words, we believe it is possible to build user

models from the use of context (and other) data and to make inferences about future

interactions from these models. This will bring intelligence and greater ease of use to

Ubiquitous devices. This project is part of a central topic of research within the

Ubiquitous Group of the AILab: Context Awareness.

PhD Thesis Proposal: G2.18

Title: “Emotion Expression with Music”

Keywords: Affective computing, AI and Music, Computational Creativity

Supervisor: Prof. Amílcar Cardoso (


Music assumes a central role in driving the emotional experience in a diverse range of

situations, from computer games to cinema, theatre and many other artistic,

entertainment and educational setups. Composing a soundtrack for such scenarios

requires the ability to align the expected emotional effect of the music and sound effects

with the intended emotional experience. The task is particularly hard when a predefined

fixed script doesn’t exist and the course of action is decided in a dynamic way, like in

most computer games and other interactive applications.

Roughly, soundtrack production includes the composition of a set of music sections

and the collection of sound and music effects, which constitute inputs for a production

phase where they are sequenced/mixed/blended. All these phases are typically

conducted by the soundtrack composer, although for computer games several

techniques exist to provide real-time sequencing solutions.

The ultimate goal of this work is to investigate computational approaches for the

problem of generating a soundtrack given a variable “emotion spectrum” as input.

Several branches of the problem may be explored, depending on the student’s

background, including but not limited to: music classification and retrieval by

emotional effect, generation of emotionally affecting music, music alignment (e.g.,

according to rhythm, dynamics, emotional effect).

G3: Simulation and Information Technologies in

Education and Training

PhD Thesis Proposal: G3.1

Title: “Alternative specification and visualization representations in initial

programming learning”

Keywords: computer science education; programming learning; alternative


Supervisor: Prof. Maria José Marcelino (


The main objective of this thesis is to study, propose and validate, on one hand, new

alternative forms of representation for algorithm and program specification and, on the

other, new alternative ways of algorithm and program visualization to support initial

programming learning, and to evaluate their impact on the quality of

students’ learning.

Initial programming learning is quite hard for the majority of students. It is usually

supported by one (or more) of three typical modes of algorithm/program representation:

pseudocode, flowcharts and code in a specific programming language. As far as

algorithm/program visualization is concerned, several approaches have also been used:

variable logs, debugging aids and simulated algorithm/program animation. Each student

has her/his own preferences about these representation and visualization metaphors.

There are particular types of programming problems that are mandatory in initial

programming learning and for which typical students’ solutions (good as well as

erroneous) have been identified. We believe that, although final programs must be

coded in one particular programming language, during initial learning stages many

programming students could benefit from the study and implementation of diverse

alternative solution representations as well as visualizations, especially if they are

closer to students’ previous experience and context.

Within the scope of this thesis, students’ preferred alternative representations, both at the

level of algorithm and program specification and at the level of results visualization, will be

identified and evaluated. New forms will then be developed and proposed in order to

cope with the difficulties students most commonly encounter and that traditional approaches

cannot deal with. These new forms will afterwards be the object of thorough evaluation.

PhD Thesis Proposal: G3.2

Title: “Learning communities to support initial programming learning”

Keywords: computer science education; programming learning; learning


Supervisor: Prof. António José Mendes (


Initial programming learning is known to be a hard task for many novice students at

college level, leading to high failure and dropout rates in many courses. Many reasons can be

found for this scenario and several approaches have been proposed to facilitate

students’ learning. However, problems continue to exist and it is necessary to

investigate new solutions that may help programming students and teachers.

The concept of learning communities has existed for some time. It has been presented as a way to

create rich learning contexts where teachers, students and other people, namely

experts, can coexist and collaborate in the production of knowledge, consequently

leading to learning enhancement.

This thesis proposal includes, first, the study of successful representative learning

communities and their characteristics and, afterwards, the study, proposal and creation of a

learning community support platform specially adapted to the needs of students

during programming learning. The platform and its use will undergo a full

evaluation, in order to assess its success in promoting programming learning. It is

expected that this platform includes innovative characteristics, for example the

inclusion of virtual members that may interact with real members when necessary and

specially tailored features and tools that may improve the quality of programming


PhD Thesis Proposal: G3.3

Title: “Problem solving patterns and remediation strategies in programming learning”

Keywords: computer science education; programming learning; learning


Supervisors: Prof. António José Mendes (

Prof. Maria José Marcelino (


Initial programming learning is known to be a difficult task for many novice students at

college level. In those courses it is common to use a set of typical problems to introduce

students to basic programming concepts and also to stimulate them to develop their

first programs and programming skills. This work is essential, since it should allow

beginners to develop the basic programming problem-solving skills necessary to be

further developed and refined later. So, this first learning stage is crucial to students’

performance in all programming related courses.

This thesis proposal includes a study about the different ways students approach these

typical basic problems, leading to the identification of common problem solving

patterns. Some of these patterns will be adequate, while others will not lead to the

development of correct solutions, being considered wrong or erroneous patterns that

must be identified and corrected in students’ strategic knowledge. Based on this

information, the main objective of the thesis will be the proposal, implementation and

evaluation of methods and/or tools that may identify novice students’ strategies,

categorize typical wrong patterns and common errors, and interact with them giving

personalized remediation feedback when necessary. The forms of this feedback must

also be studied, so that it becomes effective not only to help students to solve the

current problem, but mainly to help them to develop better approaches that may lead to

correct solutions in later problems and learning stages.

PhD Thesis Proposal: G3.4

Title: “Using Design Patterns in Modeling and Simulation”

Keywords: Model Reuse, Design Patterns, Parallel & Distributed Simulation

Supervisor: Prof. Fernando Barros (


Software Design Patterns (SDPs) are a widely used technique in software development.

Time-based SDPs have been developed for building real-time software and their use is a

promising approach to build modeling and simulation software. This proposal intends

to develop new SDPs to help the development of reusable simulation models and

reusable simulation kernels able to deal with both conservative and optimistic parallel

& distributed approaches.

PhD Thesis Proposal: G3.5

Title: “Modeling and Simulation of Adaptive Sampling Systems”

Keywords: Numerical Simulation, Sampled-Based Systems

Supervisor: Prof. Fernando Barros (


Adaptive step-size numerical methods make it possible to improve simulation performance while

yielding the same accuracy as fixed step-size methods. The use of asynchronous

numerical solvers makes it possible to concentrate computation power on the most demanding

models, enabling larger systems to be represented. The Digital Control and Digital Signal

Processing areas are currently exploiting multirate sampling and adaptive sampling

techniques as a more efficient alternative to conventional fixed sampling rate

approaches. In this proposal, we intend to develop new algorithms based on adaptive

sampling and apply them to numerical solvers, event detectors, and signal

and control systems.
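
To make the idea of an adaptive step size concrete, here is a minimal sketch that uses explicit Euler with step doubling as the error estimate. This is a textbook device chosen for illustration, not one of the algorithms the proposal intends to develop, and the function names, the default tolerance and the rule for growing the step are assumptions.

```python
def euler_step(f, t, y, h):
    """One explicit Euler step for dy/dt = f(t, y)."""
    return y + h * f(t, y)


def integrate_adaptive(f, t0, y0, t_end, h=0.1, tol=1e-5):
    """Integrate dy/dt = f(t, y), adapting the step size by step doubling.

    Each candidate step is computed twice: once with step h and once as two
    steps of h/2. Their difference serves as a local error estimate.
    """
    t, y = t0, y0
    steps = []  # accepted step sizes, kept to show how h evolves
    while t_end - t > 1e-12:
        h = min(h, t_end - t)
        full = euler_step(f, t, y, h)
        half = euler_step(f, t, y, h / 2)
        two_half = euler_step(f, t + h / 2, half, h / 2)
        err = abs(two_half - full)
        if err <= tol:
            # Accept the more accurate two-half-step result.
            t += h
            y = two_half
            steps.append(h)
            if err < tol / 4:
                h *= 2  # error comfortably small: try a larger step next
        else:
            h /= 2  # error too large: retry with a smaller step
    return y, steps
```

For a decaying solution such as dy/dt = -y, the accepted steps start small where the solution changes quickly and grow as it flattens, which is the behaviour that lets adaptive methods spend computation where it is most needed.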

G4: Adaptive Computation

PhD Thesis Proposal: G4.1

Title: “Intelligent Data Mining in GRID Technology”

Keywords: Machine Learning techniques such as Neural Networks, Support Vector

Machines and Clustering in data mining, prediction and recognition

Supervisor: Prof. Bernardete Ribeiro (


Grid technology emerged from distributed computing with the goal of generating

processing power for meeting users’ needs. To increase computing power, computing

resources are gathered across physical places. The idea was to match the unused

computing cycles with the needs created by applications and other users. This notion is

now a ubiquitous solution practised all around the world. It ensures continuous

computing availability despite scheduled maintenance, power outages, and unexpected

failures. The main idea of research is to prepare selected data mining algorithms,

preferably those originating in soft-computing, to be able to run in distributed

environments like clusters and global grids. Different techniques will be used, from

architectural parallelization of the models and data parallelization of the models to

parallel parameter-search methods for sequential models. An analysis will be given

focusing on the suitability of the techniques and particular distributed environments in

combination with the implemented data mining models. The implementations of

methods on distributed computing resources will be tested on a wide variety of databases

emerging from industry, medicine and internet in order to investigate their efficiency

and robustness.
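
Of the three techniques mentioned, parallel parameter search for a sequential model is the simplest to sketch: each parameter combination is evaluated independently, so the evaluations can be dispatched to different workers. In the sketch below a local thread pool stands in for cluster or grid nodes, and `train_and_score` is a hypothetical placeholder for training a real model and returning its validation error.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def train_and_score(params):
    """Hypothetical stand-in: fit a sequential model and return its validation error."""
    learning_rate, hidden_units = params
    return (learning_rate - 0.1) ** 2 + (hidden_units - 8) ** 2 / 100.0


def parallel_grid_search(grid, max_workers=4):
    """Evaluate every parameter combination in parallel and keep the best one."""
    candidates = list(product(*grid.values()))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each candidate is independent, so the model itself stays sequential.
        scores = list(pool.map(train_and_score, candidates))
    best_score, best_params = min(zip(scores, candidates))
    return dict(zip(grid.keys(), best_params)), best_score
```

On a real grid the pool would be replaced by whatever job submission mechanism the environment offers. The point of the sketch is only that the search parallelizes without touching the model code.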

PhD Thesis Proposal: G4.2

Title: “Learning from Side-Information and From Heterogeneous Data Sources”

Keywords: Machine Learning techniques such as Neural networks, Support vector

Machines and Clustering in data mining, prediction and recognition

Supervisor: Prof. Bernardete Ribeiro (


In the last decade we have witnessed an impressive growth of on-line information,

mostly available through the Web, intranets and other sources. It is estimated that

information density doubles every 12 to 15 months, but the capacity for reading and

analysing it remains constant. This growth is not only due to better technology that allows for

fast acquisition, but also due to faster computer technology that allows exploiting the

data in machine learning tasks. It seems however that the growth of the data is more

explosive than the boost in computing power, and this trend is unlikely to

change when Moore’s law saturates as the limits of electronics are approached.

Then only algorithmic advances can make further speed-ups possible. Along with the

explosive growth of data availability, an increasing diversity of data types can be

observed. The questions of how to deal with this heterogeneity and how to weight the

importance of different sources of data and information remain to be solved. Learning

from heterogeneous sources of data and learning in semi-supervised settings

are the main themes of research. Applications of these settings abound, in the broad

domain of bioinformatics (mainly learning from heterogeneous information), machine

vision (mainly learning from side-information, for image and video segmentation), many

generic classification problems in the field of text mining and information extraction

and many others.

PhD Thesis Proposal: G4.3

Title: “Assigning Confidence Score in Page Ranking for Intelligent Web Search”

Keywords: Ranking, Text Mining, Machine Learning

Supervisor: Prof. Bernardete Ribeiro (


The Web has become the main centre of research around the globe. Users are faced

with an overload of data when a simple search is fed into Google or a similar web search

engine. A recurrent problem is to unveil the desired information from the wealth of

available search results. Ranking, which can be achieved by providing a meaningful

score for each classification decision, is important in most practical settings. Text

retrieval systems typically produce a ranking of documents and let a user decide how

far down that ranking to go. Several Machine Learning techniques allow the definition of

scores or confidences coupled with their classification decisions. The main idea of the

current proposal is to explore ranking systems based on Bayesian approaches to the

web learning problem that allow a refinement of existing systems. Moreover,

classification systems can be improved by enriching information and information

representation with external background information, such as ontology-related data.

Evaluation can be done on benchmarks, but also with real users defining the goals and

assessing the final results, including score changes in final ranking.

PhD Thesis Proposal: G4.4

Title: “Homecare Diagnosis of Pediatric Obstructive Sleep Apnea”

Keywords: homecare; obstructive sleep apnea; reduction of complexity; biosignals

processing; computational intelligence; automatic diagnosis.

Supervisor: Prof. Jorge Henriques (


The main goal of this work is to investigate homecare solutions that could stratify

normal and apnea events for diagnostic purposes in children suspected of having

obstructive sleep apnea syndrome.

Obstructive sleep apnea syndrome (OSAS) is a condition whereby recurrent episodes of

airway obstruction are associated with asphyxia and arousal from sleep. It is estimated

to affect between 1 and 3% of young children and its potential consequences include

excessive daytime somnolence, behavioral disturbances and learning deficits,

pulmonary and systemic hypertension, and growth impairment. The currently accepted

method for diagnosis of OSAS is overnight polysomnography (PSG), done in sleep

laboratories, where multiple signals are collected by means of face mask, scalp

electrodes, chest bands etc. It monitors different activities, including brain waves (EEG),

eye movement (EOG), muscle activity (EMG), heartbeat (ECG), blood oxygen levels and

respiration. However, the diagnosis of OSAS from this huge collection of data is

sometimes not straightforward for clinicians, since the major relations between features and

consequents are most often very high dimensional, non-linear and complex. These

requirements impose the need for innovative signal processing techniques and

computational intelligence data interpretation methodologies, such as neural networks

and fuzzy systems. One of the main goals of this work is to provide clinicians with the

tools that can help them in their diagnosis.

Although PSG is considered the gold standard for diagnosis of OSAS, given the relatively

high medical costs associated with such tests and the insufficient number of pediatric

sleep laboratories, PSG is not readily accessible to children in all geographic areas.

Thus, analysis of the validity of alternative diagnostic approaches should be done, even

assuming their accuracy is suboptimal. The second goal of this work points in this

direction. It aims at investigating the viability of reducing the number and complexity of

measurements in order to make possible the stratification of OSAS in children natural


PhD Thesis Proposal: G4.5

Title: “Architectures and algorithms for real-time learning in interpretable neurofuzzy


Keywords: on-line learning; neuro-fuzzy systems; interpretability; machine learning

Supervisor: Prof. António Dourado (


The development of fuzzy rules for knowledge extraction from data acquired in real time

needs new recursive clustering techniques to produce well-designed fuzzy systems.

For Takagi-Sugeno-Kang (TSK) systems this applies mainly to the antecedents, while

for Mamdani-type systems it applies to both the antecedent and consequent fuzzy sets. To

increase the post-interpretability of the fuzzy rules, such that some semantics may be

deduced from the rules, pruning techniques should be developed to allow a human-interpretable

labelling of the fuzzy sets in the antecedents and consequents of the rules.

For this purpose convenient similarity measures between fuzzy sets and techniques for

merging fuzzy rules should be developed and applied. The applications envisaged are in

industrial processes and medical fields.
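
One commonly used similarity measure between fuzzy sets is the Jaccard index computed with min and max as intersection and union. The sketch below applies it to Gaussian membership functions and merges two sets when they are similar enough. The Gaussian shape, the merge-by-averaging rule, the threshold value and all function names are assumptions made for illustration, not the measures the proposal will develop.

```python
import math


def gaussian_mf(center, sigma):
    """Return a Gaussian membership function with the given centre and width."""
    return lambda x: math.exp(-0.5 * ((x - center) / sigma) ** 2)


def jaccard_similarity(mf_a, mf_b, lo, hi, n=1001):
    """Approximate |A ∩ B| / |A ∪ B| on a uniform grid, using min and max."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    inter = sum(min(mf_a(x), mf_b(x)) for x in xs)
    union = sum(max(mf_a(x), mf_b(x)) for x in xs)
    return inter / union if union > 0 else 0.0


def merge_if_similar(set_a, set_b, lo, hi, threshold=0.6):
    """Merge two (center, sigma) fuzzy sets by averaging when similar enough.

    Returns the resulting list of sets and the similarity that was measured.
    """
    sim = jaccard_similarity(gaussian_mf(*set_a), gaussian_mf(*set_b), lo, hi)
    if sim >= threshold:
        merged = ((set_a[0] + set_b[0]) / 2, (set_a[1] + set_b[1]) / 2)
        return [merged], sim
    return [set_a, set_b], sim
```

Merging near-duplicate sets in this way reduces the number of linguistic labels a rule base needs, which is one route to rules a human can read.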

PhD Thesis Proposal: G4.6

Title: “Intelligent Monitoring of Industrial Processes with application to a Refinery”

Keywords: intelligent process monitoring; multidimensional scaling; computational

intelligence; clustering

Supervisor: Prof. António Dourado (


High dimensional data in industrial complexes can be profitably used for advanced

process monitoring if it is reduced to a dimension where human interpretability is easily

verified. Multidimensional scaling may be used to reduce it to two or three dimensions if

appropriate measures of similarity/dissimilarity are developed. The measures express

the distance between attributes, the essence of the information, and a similar difference

should be guaranteed in the reduced space in order to preserve the informative content

of the data. Research into appropriate measures and reduction methods is needed.

In the reduced space, classification of the current operating point should be done

through appropriate recursive clustering and pattern recognition techniques. The

classification is intended to show clearly the quality level of the current and past

operating points in such a way that the human operator finds in it a useful decision

support system for the daily operation of the plant. The work will be applied to the

visbreaker process at the Galp Sines Refinery.

G5: Dependable Systems

PhD Thesis Proposal: G5.1

Title: “Grid Computing Support for Distributed Unreliable Networks”

Keywords: Grid computing, parallel programming, distributed computing, desktop


Supervisor: Prof. Paulo Marques (


Nowadays there is a huge need for parallel computing. Researchers in areas like

biology, physics, computer networks, among others, need to perform a huge number of

computer experiments in order to gather results. At the same time, there is an increasing

need for on-demand experiments (e.g. “I want to know this result now!”). Researchers

want to quickly run an experiment, which may involve thousands of calculations, in

order to know how to set “yet another parameter” or which part of a search space to

explore. They are not willing to wait days or weeks for getting simple answers that just

guide the direction of their research – they want them at the touch of a button.

Although many frameworks for parallel computing exist (e.g. MPI, PVM, OpenMP), they

are typically designed for cluster computing. This raises a problem because not all

researchers have a readily available cluster for performing experiments. Even when they

do have access to a cluster, in most cases the cluster is small relative to the

number of researchers wanting to use the resources, which increases the turn-around

time for running experiments. At the same time, cluster environments are not so

compatible with the increasing requests from researchers for on-demand computing.

The alternative is trying to run scientific applications in non-dedicated machines, also

known as desktop grids (e.g. classroom/office PCs), which abound in organizations.

But, in that case, the use of MPI and similar frameworks is very inappropriate (e.g. the

fault-model of MPI implies that if one process crashes, the whole computation dies –

which is incompatible with the unreliability of those computers!).

Although many frameworks have been developed for running parallel applications on

non-dedicated machines (e.g. Condor, BOINC, Alchemi, etc.), in order to cope with the

unreliability of the machines, they typically don’t allow communication between nodes

and the computations are based on bounded work units assigned to the nodes. For

instance, using most of these frameworks, it’s quite difficult to write grid-based

algorithms (e.g. for calculating the air flow that passes through an airplane wing) or a

global back-tracking algorithm (e.g. for optimizing a path through a network).
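
The work-unit model described above can be sketched as a master loop that hands out independent units and simply requeues any unit whose node disappears. The sketch simulates unreliability with a seeded random generator. It does not use the API of any of the frameworks named, and `run_work_units` and its parameters are hypothetical.

```python
import random
from collections import deque


def run_work_units(units, compute, failure_rate=0.3, seed=42):
    """Hand bounded work units to simulated unreliable nodes, requeuing on failure.

    Units never communicate with each other, so losing a node only costs
    the unit it was working on.
    """
    rng = random.Random(seed)  # simulated unreliability, not a real network
    pending = deque(units)
    results = {}
    attempts = 0
    while pending:
        unit = pending.popleft()
        attempts += 1
        if rng.random() < failure_rate:
            # The node left or crashed before reporting: requeue this unit only.
            pending.append(unit)
            continue
        results[unit] = compute(unit)
    return results, attempts
```

This structure tolerates failures precisely because units are independent, which is also why algorithms that need frequent communication between nodes fit it poorly.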

Finally, researchers in areas other than computer science are now coming to terms with

more modern programming environments and easier to use computer languages. For

instance, Python and Numeric Python, as well as Matlab and Mathematica, are quite

popular with biologists, physicists and even social scientists. It’s now time to move

parallel programming beyond C and Fortran.

Clearly, research is needed on programming models and infrastructures that allow

parallel programs to run on unreliable distributed networks, and at the same time

allow them to be written in modern, easy-to-use and productive computer languages.


This PhD dissertation will consist of investigating, implementing and evaluating a

programming model that allows parallel programs to be easily written and reliably

executed on desktop grids. It is also a specific objective of the thesis to deviate from the

traditional message passing paradigm and from the remote method invocation models

of creating distributed applications.

The framework to be developed will address questions such as (but not necessarily all of): a)

ease of programming; b) global reliability of the computation in the presence of failures; c)

access to stable storage for reading and writing results; d) security; e) deployment and

monitoring; f) visualization.

In this context, some technologies will be interesting to consider and explore:

distributed hash tables; P2P routing and service discovery; erasure codes; distributed

map/reduce programming models; self-organization; consensus; group membership

and election algorithms; mobile code, among others.

PhD Thesis Proposal: G5.2

Title: “Sensor Networks for Space Exploration”

Keywords: Sensor networks, Ad-hoc networking, Space exploration

Supervisor: Prof. Paulo Marques (


Sensor networks are currently a hot topic in distributed systems research. A sensor

network consists of tens or hundreds of inexpensive sensing devices, typically not much

larger than a coin, that are able to gather information from the environment, coordinate

among themselves, and relay that information to a remote location. This type of system

has a huge number of application areas, like earth observation, environment

monitoring, security, medical healthcare, among others. Typical deployment scenarios

include scattering devices through a forest for detecting the start of wildfires; placing

devices throughout a river basin for detecting pollutant dumping; or even equipping cars

with such devices for detecting and warning about imminent collision danger.

One extremely interesting application scenario for sensor nets is space exploration. It is

quite easy to imagine that deploying hundreds of sensors over some kilometers while a

space probe is descending to Mars can be quite beneficial. Instead of being limited to

gathering data where the spacecraft lands or where its rovers can move, truly distributed

data acquisition can take place. The same thing applies, for instance, in orbit for

gathering data and performing distributed experiments in Earth’s upper atmosphere, or

even to gather data from orbiting probes (the current largest European satellite is the

size of a TIR truck!).

Using sensor networks in space presents unique and challenging problems. Space is

quite a harsh place: radiation abounds causing software and hardware failures,

temperature is typically well below zero, electromagnetic interference makes radio links

quite unreliable. Since normally these devices are small, cheap and disposable,

typically they are quite limited in terms of computational power, energy and

transmission bandwidth. This makes engineering this type of network for reliability

difficult, especially for space applications.


This PhD dissertation will consist of investigating, implementing and evaluating

algorithms for fault-tolerance in sensor networks for harsh, Byzantine, environments,

for space exploration. In particular, the thesis will focus on exploring the spectrum of

possibilities for achieving different degrees of reliability in computationally limited

devices, when this reliability comes at the cost of spending energy and having to

communicate with other sensing nodes.

Currently, it is envisioned that this work will be carried out in the context of a research

project where other partners will develop the sensing platform hardware and also

provide a realistic context for fault injection and testing, possibly in a collaboration with

the European Space Agency (to be confirmed).

PhD Thesis Proposal: G5.3

Title: “Self-Healing Techniques for Application Servers”

Keywords: Autonomic computing, self-healing, software aging, rejuvenation,


Supervisor: Prof. Luís Moura e Silva (


One of the current big challenges of the computer industry is to deal with the complexity

of the systems. The Autonomic Computing initiative driven by IBM defined the following

functional areas as the cornerstones of an autonomic system: self-configuration, self-healing,

self-optimization and self-protection. The self-healing property refers to the

automatic prediction and discovery of potential failures and the automatic correction to

possibly avoid downtime of the computer system. This leads to the vision of “computers

that heal themselves” and do not depend so much on a system manager to take care of them.

While there has been some interesting work on self-healing techniques for mission-critical

systems, there is a long way to go to achieve that goal in commercial off-the-shelf

(COTS) servers running Apache/Linux, Tomcat, JBoss or Microsoft .NET. The purpose of

this PhD is to study and propose low-cost and highly-effective self-healing techniques

for these application servers.

One of the potential causes of failures in 24x7 server systems is the occurrence of

software aging. This phenomenon should be studied in detail together with high-level

techniques for application-level failure detection. Some mathematical techniques should

be applied to detect software aging and to forecast the potential time for the failure of

the server system. When the aging is detected the server system should apply proactively

a software rejuvenation technique to avoid the potential crash and to keep the

service up and running. Techniques for micro-rejuvenation should be further studied to

avoid downtime of the server. The final result of this PhD should be a set of software

artifacts and the refinement of data analysis techniques to apply in COTS application

servers in order to predict failures and software aging in advance and to apply some

corrective action to avoid a server crash.
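
As a simple illustration of forecasting the time to failure from an aging indicator, the sketch below fits a least-squares line to memory-usage samples and extrapolates to the point where a resource limit would be reached. Linear trend extrapolation is only one of many possible techniques, and the indicator (memory in MB), the function names and the safety-margin rule are assumptions rather than the methods this PhD will propose.

```python
def fit_line(ts, ys):
    """Ordinary least-squares fit y = slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def forecast_exhaustion(ts, used_mb, limit_mb):
    """Estimate when usage reaches limit_mb, or None if there is no upward trend."""
    slope, intercept = fit_line(ts, used_mb)
    if slope <= 0:
        return None
    return (limit_mb - intercept) / slope


def should_rejuvenate(now, ts, used_mb, limit_mb, safety_margin):
    """Trigger proactive rejuvenation when the forecast failure is close enough."""
    eta = forecast_exhaustion(ts, used_mb, limit_mb)
    return eta is not None and eta - now <= safety_margin
```

With samples taken at regular intervals, `should_rejuvenate` would be evaluated periodically so that a restart can be scheduled before the forecast exhaustion time rather than after a crash.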


This PhD will comprise the following initial tasks: (1) State of the art in

Autonomic Systems, Self-healing, Software Aging, Software Rejuvenation, Micro-rebooting

and Dependability Benchmarking; (2) Machine learning techniques to forecast

failures and software aging; (4) Application-level techniques for failure prediction

and early detection; (5) Micro-rejuvenation techniques for application servers; (6)

Extension of the techniques to SOA-based and N-tier applications; (7) Dependability

benchmarking; (8) Implementation of an experimental framework; (9) Analysis of

experimental results.

PhD Thesis Proposal: G5.4

Title: “Dependable and Self-Managed VoIP Infrastructures”

Keywords: Autonomic computing, Self-healing, Dependability, Peer-to-Peer, VoIP,


Supervisor: Prof. Luís Moura e Silva (


Peer-to-peer techniques have been widely applied for decentralized file-sharing in the

internet, distributed computing, content distribution and to support applications like

Voice-over-IP. The most popular example is Skype that is based on a supernode-based

P2P network. Since in those P2P networks some of the server-based services can be

provided by the client machines, there is a mandatory need to provide the P2P

infrastructure with techniques for self-configuring, self-healing and self-management,

in line with Autonomic Computing systems.

The goal of this thesis is to devise and study some software techniques to enhance the

dependability and autonomic computing capabilities of VoIP infrastructures. In a first

step, the goal is to provide fault-tolerant mechanisms for VoIP servers, by studying the

reliability of staged-event servers, the use of software fail-over mechanisms, studying

failure-analysis and prediction to anticipate the occurrence of software aging inside a

server, the use of micro-rebooting of service components and complementary

techniques to enhance the self-healing capabilities of a VoIP server. In the second phase

of the thesis, it would be interesting to study the use of high-level reconfiguration

techniques that will require architectural changes in the VoIP infrastructure and some

reconfigurable usage of SIP/RTP protocols with the ultimate goal of providing a higher

QoS and transparency of failures for the end-user of the application. The result should

be a highly-reliable P2P infrastructure that can be used to enhance the QoS of VoIP

applications like Skype.


This PhD will comprise the following initial tasks: (1) State-of-the-art about

Peer-to-Peer networks, VoIP infrastructures, STUN/TURN servers, SIP and RTP protocols, Self-healing

techniques, Staged-event servers, Load-balancing and Server Reconfiguration;

(2) Application-level techniques for failure prediction and early detection; (3) Prediction

of server availability; (4) Software techniques for load-balancing (wackamole project); (5)

Study of self-healing techniques for staged-event servers (SEDA project); (6) Micro-rebooting

techniques for VoIP servers; (7) Reconfiguration techniques for VoIP servers;

(8) Support for protocol reconfiguration; (9) Construction of the framework and results;

(10) Data analysis.

PhD Thesis Proposal: G5.5

Title: “Wired self-emerging ad hoc network”

Keywords: peer-to-peer, ad hoc, distributed hash table.

Supervisor: Prof. Filipe Araújo


In recent years computer communication is departing from the client-server

architecture and moving increasingly toward a peer-to-peer architecture. One

aspect that characterizes this kind of interaction is the opportunistic participation of

many of the peers: they connect to the network for only a few moments, just to discover

and download (or not) what they are looking for and then they disconnect. Interestingly,

mobility and battery exhaustion can reproduce this same trend in wireless ad hoc

networks, comprised of devices that use radio broadcast to communicate.

While wired peer-to-peer and wireless ad hoc networks share a number of common

features, like self-configuration, decentralized and fault-tolerant operation, they have

however an important difference: wired peer-to-peer networks run as overlay networks

on top of the IP infrastructure. This raises the following question: can we take the

paradigms from wireless networks and create IP-less self-organizing wired networks?

Our goal is to plug new devices, or even entire networks, in and out of the wired

infrastructure in a scalable and decentralized way, without the need for any a priori

configuration. In contrast, current IP networks can only scale because they are highly

hierarchical and require a considerable amount of human assistance. As a

consequence they are often highly congested, expensive to maintain and unreliable.

The fundamental difference between the solution we seek and wireless ad hoc networks

has to do with available bandwidth. In fact, the most important constraint that makes

collection of routing information so challenging and that limits the pace of change of

topology in wireless ad hoc networks is the (lack of) available bandwidth. Available

bandwidth is a very scarce resource, because it is shared among all the nodes. This

makes it theoretically impossible to create a wireless ad hoc network that scales with

the number of nodes. As a consequence, algorithms for wireless ad hoc networks are

often localized or have, at most, very limited information of distant regions of the

network. This is very unlike the situation in wired networks: for the same pace of

topological change, the supply of bandwidth is not shared and it is much larger. This

paves the way for better and more powerful solutions, which, we believe, are largely

unexplored in the literature.


This PhD work encompasses the following tasks: (a) review of the state-of-the-art; (b)

design of the architecture; (c) evaluation of the scalability of the architecture (admissible

number of nodes and topological changes versus available bandwidth); (d) exact and

range-based lookup algorithms that build on previous work on distributed hash

tables and peer-to-peer file-sharing applications; (e) design of an interconnection

infrastructure, to connect islands of wired ad hoc networks with the IP network.
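
As an illustration of the exact-lookup problem in task (d), the following is a minimal sketch of a consistent-hashing ring, the basic idea behind many distributed hash tables. It is not the architecture sought by this proposal: the names `HashRing`, `key_id`, `join`, `leave` and `lookup` are hypothetical, the ring is held in one process rather than distributed, and the 32-bit identifier space is an arbitrary assumption.

```python
import bisect
import hashlib


def key_id(name, bits=32):
    """Map a string to an identifier on a ring of size 2**bits (assumed size)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class HashRing:
    """Toy consistent-hashing ring: each key is owned by its successor node."""

    def __init__(self):
        self._ids = []      # sorted node identifiers
        self._names = {}    # identifier -> node name

    def join(self, node):
        """Add a node to the ring without any other configuration."""
        nid = key_id(node)
        if nid not in self._names:
            bisect.insort(self._ids, nid)
            self._names[nid] = node

    def leave(self, node):
        """Remove a node; only the keys it owned change owner."""
        nid = key_id(node)
        if nid in self._names:
            self._ids.remove(nid)
            del self._names[nid]

    def lookup(self, key):
        """Return the node whose identifier follows the key on the ring."""
        if not self._ids:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._ids, key_id(key))
        return self._names[self._ids[idx % len(self._ids)]]
```

In this sketch a node that joins or leaves affects only the keys between itself and its predecessor, which is the property that makes plug-in and plug-out operation cheap. Range-based lookups would need an order-preserving mapping instead of a cryptographic hash.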

PhD Thesis Proposal: G5.6

Title: “Fast Moving Wireless Ad Hoc Nodes”

Keywords: peer-to-peer, wireless ad hoc, wireless infrastructured, Wireless Access

for the Vehicular Environment (WAVE).

Supervisors: Prof. Filipe Araújo

Prof. Luís Moura e Silva


In recent years we have witnessed an increasing interest in wireless networks. While

most current applications seem to be set for sensor networks, we can foresee many

other applications for mobile ad hoc or mixed ad hoc/infrastructured networks, where

nodes are mainly mobile and communication goes beyond simple data gathering of a

sensor network. For instance, applications can enhance the behavior of a crowd by

providing additional services to users holding mobile wireless devices, like search for a

given person that is momentarily lost, search for a person that matches some social

interests, exchange of diverse information, of a product, etc. Another context that is

extremely promising is that of a spontaneous network formed by cars on a road,

enriched with some infrastructure that is able to provide traffic, weather and other

information to drivers. By letting cars share their information, it may be possible to save

significant costs in the infrastructure and still considerably improve the quality and

quantity of information.

In this PhD work we want to build on some existing routing algorithms for wireless

ad hoc networks and make them work on particular environments with specific

patterns of mobility. Interestingly, in networks with a high degree of mobility it is often

possible to increase the speed of the flow of information, because mobility creates more

opportunities to exchange this information. In particular, we want to consider a

scenario where the network is comprised of fast-moving cars equipped with IEEE

802.11p network adapters (Wireless Access for the Vehicular Environment – WAVE).

This is a case where part of the information is created and sent to some points of the

infrastructure through a chain of nodes, while at the same time, cars can also introduce

new information in the network, for instance by signaling their presence to cars in

front, in the rear or to cars traveling in the opposite direction. In particular, the

information shared with cars going in the opposite direction first and then with the base

stations located along the road is of paramount utility as this has the potential to

propagate very accurate data of traffic jams or accidents at virtually no cost. We expect

to apply similar principles to more complex but slower-moving networks comprised of

people with handheld or other wireless devices walking in crowds.


This PhD work encompasses the following tasks: (a) review of the state-of-the-art; (b)

design of routing algorithms for environments with high mobility; (c) design of

information-sharing applications for environments with high mobility; (d) simulation in

real environments.

PhD Thesis Proposal: G5.7

Title: “Detecting Software Aging in Database Servers”

Keywords: Software aging, software rejuvenation, autonomic computing, database

management systems, dependability benchmarking

Supervisors: Prof. Marco Vieira

Prof. Luís Moura e Silva


One of the main problems in software systems that have some complexity is the

problem of software aging, a phenomenon that is observed in long-running applications

where the execution of the software degrades over time leading to expensive hangs

and/or crash failures. Software aging is not only a problem for desktop operating

systems: it has been observed in telecommunication systems, web-servers, enterprise

clusters, OLTP systems, spacecraft systems and safety-critical systems.

Software aging happens due to the exhaustion of system resources, caused for instance by memory leaks,

unreleased locks, non-terminated threads, shared-memory pool latching, storage

fragmentation, data corruption and accumulation of numerical errors. There are several

commercial tools that help to identify some sources of memory-leaks in the software

during the development phase. However, not all the faults can be avoided and those

tools cannot work in third-party software modules when there is no access to the

source-code. This means that existing production systems have to deal with the

problem of software aging.

The natural procedure to combat software aging is to apply the well-known technique of

software rejuvenation. There are two basic rejuvenation policies: time-based

and prediction-based rejuvenation. The first applies a rejuvenation action periodically,

while the second makes use of predictive techniques to forecast the occurrence of

software aging and applies the rejuvenation action only when strictly necessary.
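
The difference between the two policies can be sketched in a few lines. The example below is only an illustration under simple assumptions: aging is modelled as a linear trend in a single monitored resource (for instance memory in use), fitted by ordinary least squares. The function names, the resource limit and the warning horizon are hypothetical, and a real study would use the richer forecasting techniques listed in the tasks of this proposal.

```python
def fit_line(times, values):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0:
        raise ValueError("timestamps must not all be equal")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def time_to_exhaustion(times, values, limit):
    """Time from the last sample until the fitted trend reaches `limit`.

    Returns None when the trend is flat or decreasing (no aging detected).
    """
    slope, intercept = fit_line(times, values)
    if slope <= 0:
        return None
    t_hit = (limit - intercept) / slope
    return max(0.0, t_hit - times[-1])


def time_based_due(uptime, period):
    """Time-based policy: rejuvenate every `period`, whatever the state."""
    return uptime >= period


def prediction_based_due(times, values, limit, horizon):
    """Prediction-based policy: rejuvenate only if exhaustion is forecast
    to happen within the next `horizon` time units."""
    remaining = time_to_exhaustion(times, values, limit)
    return remaining is not None and remaining <= horizon
```

With samples taken every 60 seconds, a call such as `prediction_based_due(times, values, limit=200, horizon=600)` would trigger rejuvenation only when the fitted trend reaches the limit within ten minutes, whereas `time_based_due` fires on schedule even if the server shows no sign of aging.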

The goal of this PhD Thesis is to study the phenomenon of software aging in commercial

database engines, to devise and implement some techniques to collect vital information

from the engine and to forecast the occurrence of aging or potential anomalies. With

this knowledge the database engine can apply a controlled action of rejuvenation to

avoid a crash or a partial failure of its system. The ultimate goal is to improve the

autonomic computing capabilities of a database engine, mainly when subjected to high

workload and stress-load from the client applications.


The PhD work will comprise the following initial tasks: (a) overview of the

state-of-the-art about software aging, rejuvenation and dependability benchmarking; (b)

development of a tool for dependability benchmarking of database engines; (c)

development of a workload and stress-load tool for databases; (d) infrastructure of

probes (using Ganglia) to collect vital information from a database engine; (e)

development of mathematical techniques to forecast the occurrence of software aging

(time-series analysis, data-mining, machine-learning, neural-networks); (f) experimental

study and analysis of results; (g) adaptation of rejuvenation techniques for database

engines; (h) writing of papers.

PhD Thesis Proposal: G5.8

Title: “Security benchmarking of COTS components”

Keywords: Software reliability, Security benchmarking, Experimental evaluation,

Dependability benchmarking

Supervisors: Prof. Henrique Madeira

Prof. João Durães


One of the main problems in software systems is the vulnerability to malicious attacks.

Complex systems and systems that have high degree of interaction with other systems

or users are more prone to be successfully attacked. The consequences of a successful

attack are potentially very severe and may include the theft of mission-critical

information and trade secrets. Given the pervasive nature of software systems in

modern society, the issue of security and testing for vulnerability to attacks is an

important research area.

The vulnerability of software systems is caused by several factors. Two of these factors

are the integration of third-party off-the-shelf components to build larger components,

and bad programming practices. The integration of third-party generic-purpose

components may introduce vulnerabilities in the larger system due to interface

mismatch between the components that may be exploited for attacks. Bad programming

practices may lead to weaknesses that may be exploited by tailored user inputs. Testing

a system or component against malicious attacks is a difficult problem and is currently

an open research area. Testing for vulnerabilities to malicious attacks cannot be

performed as traditional testing because there is no previous knowledge about the

nature of the attacks. However, these attacks follow a logic based on exploiting possible

weaknesses inside the software and this logic can be used to forecast the existence of


The goal of this PhD Thesis is to study the phenomenon of attacks on software systems

and devise a methodology to assess the vulnerability to these attacks. This includes the

proposal of experimental techniques to test systems and components following the logic

of dependability benchmarking and experimental evaluation. It is expected that at the

conclusion of the Thesis there will be a case study with practical results of security

assessment and vulnerability forecasting for comparison purposes. Web servers are

suggested as one of the possible case studies. Fault injection techniques and robustness

testing techniques should be considered as enabling techniques for the purposes of the



The PhD work will comprise the following initial tasks:

(a) overview of the state-of-the-art about software security, software vulnerabilities,

software defects, validation methods, robustness testing and dependability


(b) development of methods and tools for analysis of the identification of patterns

related to vulnerabilities and the automated testing of the possible vulnerabilities (case

studies include web-servers);

(c) proposal of generic test methodologies for evaluation of software vulnerability to

malicious attacks based on software defects and program pattern analysis for system

comparison purposes;

(d) proposal of formal methodologies for experimental assessment of security and

vulnerability forecasting on third-party (black-box) software components;

(e) development of experimental infrastructure of tools for practical demonstration of the

above to real systems (case studies include web-servers);

(f) experimental study. Analysis of results;

(g) writing of papers;

Target conferences to publish papers:

- Dependable Systems and Networks (DSN)

- International Conference on COTS-Based Software Systems (ICCBSS)

- International Conference on Computer Safety, Reliability and Security (SAFECOMP)

- International Symposium on Software Reliability Engineering (ISSRE)

G6: Communications and Telematics

PhD Thesis Proposal: G6.1

Title: “Security in Wireless Sensor Networks”

Keywords: Sensor Networks, Security, Mobility

Supervisor: Prof. Jorge Sá Silva


Although recent years have seen an increase in the processing capabilities and

bandwidth of communication systems, several researchers consider that, in the near

future, an inversion of trends will occur. These new computational systems will not

consist of devices with higher processing power, but simply of networks of sensors.

Wireless Sensor Networks (WSNs) are composed of a large number of nodes, each one

equipped with a microprocessor, low memory and a basic communication system.

The integration of WSNs in the Internet will revolutionize several concepts and it will

require new paradigms. More recently, a new group was created at the IETF, the

6LoWPAN, which is responsible for producing problem statements, assumptions and goals

for network elements with restricted requirements such as limited power, in which the

WSNs can be included.

However, security mechanisms for these networks are scarce and inefficient. The

research work of this PhD program will comprise the study and the proposal of new

models for the security of these new Internet elements. This is particularly important in

mobile environments, as the research community has generally assumed sensors to be static

nodes. The new WSN protocols should consider security issues to protect against

eavesdropping and malicious behavior.

PhD Thesis Proposal: G6.2

Title: “Multicast in Next Generation Networks”

Keywords: Multicast, 4G Networks, Mobility

Supervisor: Prof. Jorge Sá Silva


Multicast communication in the Internet has received increasing attention in the

last few years. Nowadays, there are more and more applications that require

communication systems with multipoint communication capabilities. Multicast

communication reduces both the time it takes to send data to a large set of receivers

and the amount of network resources required to deliver such data.

The appearance of such new applications, with multicast requirements, highlighted the

need for true multicast protocols in the IP layer. However, traditional solutions were too

complex to implement and only a few network devices support them.

The purpose of this research work is to study new multicast paradigms that offer simple

solutions for the Next Generation Internet, which will include elements such as laptops,

PDAs, mobile phones and sensors. To date, the design goals for multicast protocols in

wired or wireless environments haven’t included sensor nodes. However, this will be

crucial as sensors will be preponderant elements in the Next Generation Internet.

PhD Thesis Proposal: G6.3

Title: “Routing for Resilience in Ambient Networks”

Keywords: Routing, resilience, ambient networks

Supervisors: Prof. Edmundo Monteiro

Prof. Marilia Curado


The new types of applications and technologies used nowadays for communication

among users, and the diversity of types of users, have shown that traditional routing

paradigms are not capable of coping with these recent realities. Therefore, the role of

routing in IP networks has shifted from single shortest path routing to multiple path

routing subject to multiple constraints such as Quality of Service requirements and

fault tolerance. Moreover, traditional routing protocols have several problems

concerning routing information distribution which compromises routing decisions.

Namely, routing decisions that are based on inaccurate information due to bad routing

configurations either caused by faulty or malicious actions will cause severe disruption

in the service that the network should provide.

These are particularly important issues in networks that involve different types of

communication devices and media, such as happens in ambient networks. Ambient

networks pose an additional challenge to routing protocols, since network composition

changes very often when compared to traditional IP networks, and networks are

expected to cooperate among each other on-demand without relying on previous

configuration. Moreover, associated with the dynamic structure of ambient networks,

traffic patterns in ambient networks also change very often due to the composition and

decomposition of network structure.

The work proposed for this thesis aims to study the existing vulnerabilities of current

routing protocols used in the Internet and to propose a resilient routing scheme that

overcomes these weaknesses in order to improve network availability and survivability.

The work will comprise the study of the state of the art of routing protocols for

resilience, the characteristics of ambient networks, and the proposal of enhancements

to existing routing schemes in order to improve the contribution of the routing protocol

for the resilience of ambient networks.

The research work of the PhD candidate will be included in the European Union

Integrated Project WEIRD (WiMAX Extension to Isolated Research Data networks).

PhD Thesis Proposal: G6.4

Title: “Community networks connectivity and service guarantees”

Keywords: Mobility, nomadicity, community networks

Supervisor: Prof. Fernando Boavida


A number of recent technological developments have enabled the formation of wireless

community-wide local area networks. Dispersed users (residents or moving users)

within the boundaries of a geographical region (neighbourhood or municipality) form a

heterogeneous network and enjoy network services such as Internet connectivity. This

environment, named Community Networks, is well suited for both traditional Internet

access and the deployment of peer-to-peer services.

Achieving and retaining connectivity in this highly heterogeneous environment is a

major issue. Although the technology advances in wireless networks are fairly mature,

one further step in the management of Community Networks is to provide mobility and

nomadicity support. Nomadicity allows connectivity everywhere while mobility includes

the maintenance of the connections and sessions while the node is moving from one

place to another. Mobility and nomadicity in community and home networks still pose

several challenges, as these environments are highly heterogeneous.

Seamless handover between different layer two technologies is still a challenge.

Seamless multimedia content distribution to the home may include several network

technologies, such as WLAN, power-line, GPRS or UMTS. Thus, the inter-layer issues

involved are complex and a lot of work is required to match MIP6 with them in order to

provide seamless mobility for multimedia information. The ability to support sessions

on multiple access networks is another open issue. These issues will be the central

concern of the proposed PhD work.

The research work of the PhD candidate will be included in the European Union IST

FP6 CONTENT Network of Excellence (CONTENT – Content Networks and Services for

Home Users), and will be carried out in close cooperation

with a foreign institution.

G7: Databases

PhD Thesis Proposal: G7.1

Title: “The Optimization Problem (OP) for QoS-compliance in Systems Engineering”

Keywords: QoS-Broker, Distributed Programming, SLAs, Optimization

Supervisor: Prof. Pedro Furtado


The capacity to monitor and control QoS-parameters and their lifecycle is an important

aspect for today’s mature systems and software. Quality-of-Service and automatic

adaptation are also at the centre of most current developments in systems and

software. Another frequent issue in today’s parallel, distributed, mobile and, in general,

networked systems is the optimization of content placement and replication and its

applications. Our group has been working on both issues and their relationship and

has two projects running. We have interesting and innovative proposals that make very

good PhD works. We also have some basic algorithmic and solver pilot prototypes for

(OP) and for QoS functionality as starting points and a detailed knowledge of current

and promising work. Besides these issues, the proposal is an opportunity for the PhD

candidate to learn and work with technologies such as Condor, Java and solver software,

and to work on exciting algorithms within a helpful team.

PhD Thesis Proposal: G7.2

Title: “QoS-Brokering Lifecycle with Automated Monitoring for Generic Applications

and Networked Data Intensive Systems”

Keywords: QoS-Broker, Distributed Programming

Supervisor: Prof. Pedro Furtado


The Generic QoS-Broker is a piece of software that can sit anywhere in any system

and whose purpose is to make contracts with, monitor and react to any piece of software

in any desired manner to provide factual or optimistic Quality of Service guarantees. It

can also interact with lower level QoS-Brokers, so that in principle any desired QoS

objective can be met in any context. The questions that must be answered are: how

does it work? Given certain contexts, how is it applied? (e.g. mobile, real-time, data

management, etc). Our group has been busy working on some of these issues and has

two projects related to this issue. We have some initial prototypes for these systems. As

a result, we have quite a few interesting and innovative proposals that make very good

PhD works.
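
To make the contract, monitor and react cycle concrete, the following is a minimal sketch of what such a broker could look like in its simplest form. It is not the group's prototype: the classes `Contract` and `Broker`, the single response-time bound and the `on_violation` callback are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Contract:
    service: str
    max_response_ms: float   # agreed upper bound on response time (assumed metric)


@dataclass
class Broker:
    contract: Contract
    on_violation: Callable[[str, float], None]   # reaction hook (hypothetical)
    history: List[float] = field(default_factory=list)

    def observe(self, response_ms: float) -> bool:
        """Record one measurement; return True if it honours the contract."""
        self.history.append(response_ms)
        ok = response_ms <= self.contract.max_response_ms
        if not ok:
            self.on_violation(self.contract.service, response_ms)
        return ok

    def compliance(self) -> float:
        """Fraction of observations within the contract (1.0 if none yet)."""
        if not self.history:
            return 1.0
        good = sum(1 for r in self.history if r <= self.contract.max_response_ms)
        return good / len(self.history)
```

The reaction hook is where a real broker would do its work, for example by renegotiating the contract or by delegating to a lower-level broker. The sketch leaves those questions open, as they are the subject of the proposal.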

PhD Thesis Proposal: G7.3

Title: “Automatic Time-prediction and QoS in Distributed Data-Intensive Systems”

Keywords: QoS-Broker, Distributed Programming

Supervisor: Prof. Pedro Furtado


The capacity to monitor and control QoS-parameters is an important aspect for today’s

mature systems and software. Quality-of-Service and automatic adaptation are also at

the centre of most current developments in systems and software. There are several

works on these issues. Our group has been working on mixing generic QoS with data

services. We would like to explore this in a typical current distributed computing

platform. The proposal is an opportunity for the PhD candidate to learn and work with

technologies such as Condor and Java, and to work on exciting algorithms within a

helpful team. Related Projects: Adapt-DB, current proposal in cooperation with other

CISUC group.

PhD Thesis Proposal: G7.4

Title: “Optimal Caching, Replication, Placement and Compression of Streams for

Multimedia Streaming in Mobile and Heterogeneous Systems”

Keywords: QoS, Replication, Placement, Caching

Supervisor: Prof. Pedro Furtado


Given the rise and convergence of network technologies and of streaming and rich

media content delivery, the caching, replica placement and compression of streams

become increasingly important issues. Given our expertise in replica management,

caching and QoS brokering, we expect to produce innovative proposals in this context

that make very good PhD works. Related Projects: current proposal, in cooperation with

other CISUC group.

PhD Thesis Proposal: G7.5

Title: “Predictability, Portability, Auto-Adaptability and Re-Organization of Data


Keywords: RDBMS, Replication, Placement, Caching, Query Processing

Supervisor: Prof. Pedro Furtado


Auto-adaptability of Services and Applications is a hot topic currently. We have

developed work on both run-anywhere data management services and generic QoS-Broker

Architectures. Now we are particularly interested in providing automatic tools to

determine placement, replica and other adaptation strategies based on our monitoring

and history-analysis capacity. We have a basic prototype and several initial ideas on

these issues. As a result, we have quite a few interesting and innovative proposals that

make very good PhD works. Related Projects: Auto-DWPA.

PhD Thesis Proposal: G7.6

Title: “Data Anywhere, at Anytime with QoS in Heterogeneous Networks”

Keywords: QoS, Replication, Placement, Caching

Supervisor: Prof. Pedro Furtado


The current trend towards mobile and ubiquitous computing should fundamentally

change our concepts of mostly static data storage and access. In the future, access will

increasingly be mobile from multiple devices, places and through different mediums

and the user will not define or worry about where and how the access happens or where

the contents are located. This represents a paradigm shift for the user from specifying

“where and how” to specifying “Object Properties”. UbiData is an automatic transparent

manager for user content through QoS definition. Users specify all kinds of objects,

access and resource properties and the system handles the objects automatically

(storage, consistency, availability, other QoS parameters). Related Projects: Adapt-DB,

current proposals;

PhD Thesis Proposal: G7.7

Title: “QoS and Replica-based Strategies for QoS-Brokering in Adaptable Data


Keywords: QoS, Transactional Systems

Supervisor: Prof. Pedro Furtado


Auto-adaptability of Services and Applications is a hot topic currently. We have

developed work on both run-anywhere data management services and generic QoS-Broker

Architectures. Now we are working on adopting interesting QoS and Replication

strategies in transactional environments. As a result, we have quite a few interesting

and innovative proposals that make very good PhD works. Related Projects: Adapt-DB;

PhD Thesis Proposal: G7.8

Title: “Timely ACID Transactions in DBMS”

Keywords: Databases, transaction processing, performance and QoS, timely

transactions, real-time databases, fault-tolerance

Supervisors: Prof. Marco Vieira

Prof. Henrique Madeira


On-time data management is becoming a key difficulty faced by the information

infrastructure of most organizations. A major problem is the capability of database

applications to access and update data in a timely manner. In fact, database

applications for critical areas (e.g., air traffic control, factory production control, etc.)

are increasingly giving more importance to the timely execution of transactions.

Database applications with timeliness requirements have to deal with the possible

occurrence of timing failures, when the operations specified in the transaction do not

complete within the expected deadlines. For instance, in a database application

designed to manage information about a critical activity (e.g., a nuclear reactor), a

transaction that reads and stores the current reading of a sensor must be executed in a

short time as the longer it takes to execute the transaction the less useful the reading

becomes. This way, when a transaction is submitted and it does not complete before a

specified deadline, that transaction becomes irrelevant and this situation must be

reported to the application/business layer in order to be handled in an adequate way.
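
The detect-and-report behaviour described above can be sketched at the application level. The example below is a minimal illustration using Python's standard `sqlite3` module and is not the approach of the TACID project: `run_timely`, `DeadlineMissed` and the injectable `clock` parameter are hypothetical names, and the check happens only after the work has finished, so it detects a timing failure but does not interrupt a slow transaction.

```python
import sqlite3
import time


class DeadlineMissed(Exception):
    """Raised when a transaction finishes after its deadline and is rolled back."""


def run_timely(conn, work, deadline_s, clock=time.monotonic):
    """Run work(conn) as one transaction; commit only if it finished in time.

    `clock` is injectable so the behaviour can be tested without sleeping.
    """
    start = clock()
    try:
        result = work(conn)
    except Exception:
        conn.rollback()          # ordinary failure: undo and propagate
        raise
    elapsed = clock() - start
    if elapsed > deadline_s:
        conn.rollback()          # timing failure: the result is no longer useful
        raise DeadlineMissed(
            f"took {elapsed:.3f}s, deadline was {deadline_s:.3f}s"
        )
    conn.commit()
    return result


def store_reading(conn):
    """Example unit of work: store one sensor reading (hypothetical schema)."""
    conn.execute("INSERT INTO readings VALUES (?, ?)", ("core-temp", 291.5))
```

A caller would catch `DeadlineMissed` in the application or business layer and decide how to handle the stale reading, for example by discarding it and sampling the sensor again.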

In spite of the importance of timeliness requirements in database applications,

commercial DBMS do not assure any temporal properties, not even the detection of the

cases when the transaction takes longer than the expected/desired time. The goal of

this work is to bring timeliness properties to the typical ACID (atomicity, consistency,

isolation, durability) transactions, putting together classic database transactions and

recent achievements in the field of real time and distributed transactions. This work will

be developed in the context of the TACID (Timely ACID Transactions in DBMS) research

project, POSC/EIA/61568/2004, funded by FCT.


The PhD work will comprise the following initial tasks: (a) overview of the

state-of-the-art about timely computing and real-time databases; (b) characterization of timed

transactions; (c) analysis of DBMS core implementations; (d) infrastructure to support

timely execution of ACID transactions; (e) development of mathematical techniques to

forecast transactions execution times; (f) Implementation and evaluation; (g) writing


PhD Thesis Proposal: G7.9

Title: “Security Benchmarking for Transactional Systems”

Keywords: Security, benchmarking, database management systems, transactional


Supervisors: Prof. Marco Vieira

Prof. Henrique Madeira


One of the main problems faced by organizations is the protection of their data against

unauthorized access or corruption due to malicious actions. Database management

systems (DBMS) constitute the kernel of the information systems used today to support

the daily operations of most organizations and represent the ultimate layer in

preventing unauthorized access to data stored in information systems. In spite of the

key role played by the DBMS in the overall data security, no practical way has been

proposed so far to characterize the security in such systems or to compare alternative

solutions concerning security features. Benchmarks are standard tools that allow

evaluating and comparing different systems or components according to specific

characteristics (e.g., performance, robustness, dependability, etc.). In this work we are

particularly interested in benchmarking security aspects of transactional systems.

Thus, the main goal is to research ways to compare transactional systems from a

security point-of-view. This work will be developed in the context of a research

cooperation with the Center for Risk and Reliability of the University of Maryland, MD,

USA. During this work the student will have the opportunity to visit the University of

Maryland in order to carry out joint work with local researchers.


The PhD work will comprise the following initial tasks: (a) overview of the

state-of-the-art about security, security evaluation, and dependability benchmarking; (b)

definition of a security benchmarking approach for transactional systems; (c) study of

attacks for security benchmarking; (d) definition of a standard approach for security

evaluation and comparison; (e) implementation and evaluation; (f) writing papers;

PhD Thesis Proposal: G7.10

Title: “Dependability Benchmarking for Distributed and Parallel Database


Keywords: Databases, distributed systems, parallel systems, dependability

benchmarking, fault-tolerance

Supervisors: Prof. Marco Vieira, Prof. Henrique Madeira


The ascendance of networked information in our economy and daily lives has increased

the awareness of the importance of dependability features. In many cases, such as in e-commerce

systems, service outages may result in a huge loss of money or in an

unaffordable loss of prestige for companies. In fact, due to the impressive growth of the

Internet, some minutes of downtime in a server somewhere may be directly exposed as

loss of service to thousands of users around the world.

Database systems constitute the kernel of the information systems used today to

support the daily operations of most organizations. Additionally, in recent years, there

has been an explosive growth in the use of databases for decision

support systems. The biggest differences between decision support systems and

operational systems, besides their different goal, are the type of operations executed

and the supporting database platform. While operational systems execute thousands or

even millions of small transactions per day, decision support systems only execute a

small number of queries on the data (in addition to the loading operations executed


Advanced database technology, such as parallel and distributed databases, is a way to

achieve high performance and availability in both operational and decision support

systems. However, although distributed and parallel database systems are increasingly

being used in complex business-critical systems, no practical way has been proposed so

far to characterize the impact of faults in such environments or to compare alternative

solutions concerning dependability features. The fact that many businesses require very

high dependability for their database servers shows that a practical tool that allows the

comparison of alternative solutions in terms of dependability is of utmost importance.

In spite of the pertinence of having dependability benchmarks for distributed and

parallel database systems, the reality is that no dependability benchmark has been

proposed so far. A dependability benchmark is a specification of a standard procedure

to assess dependability related measures of a computer system or computer

component. The awareness of the importance of dependability benchmarks has

increased in recent years and dependability benchmarking is currently the subject

of active research. In previous work, the first known dependability benchmark for

transactional systems was proposed. However, this benchmark focuses on single-server

transactional databases. The goal of this work proposal is to study the problem of

dependability benchmarking in distributed and parallel databases. One of the key

aspects to be addressed is to figure out how to apply a faultload (set of faults and

stressful conditions that emulate real faults experienced by systems in the field) in a

distributed/parallel environment. Several types of faults will be considered, namely:

operator faults, software faults, and hardware faults (including network faults).


The PhD work will comprise the following initial tasks: (a) overview of the state of the

art in parallel and distributed databases, dependability assessment, and

dependability benchmarking; (b) definition of a dependability benchmarking approach

for distributed and parallel databases; (c) study of typical faults in distributed and

parallel database environments; (d) definition of a standard approach for dependability

evaluation and comparison in distributed and parallel databases; (e) implementation

and evaluation; (f) writing papers.

PhD Thesis Proposal: G7.11

Title: “Techniques to Improve Performance in Affordable Data Warehouses”

Keywords: Data warehousing, on-line analytical processing (OLAP), performance,

parallel and distributed databases

Supervisors: Prof. Jorge Bernardino, Prof. Henrique Madeira


In recent years, there has been an explosive growth in the use of databases for decision

support. These systems, generically called Data Warehouses, involve manipulations of

massive amounts of data that push database management technology to the limit,

especially with regard to performance and scalability. In fact, typical data

warehouse utilization has an interactive characteristic, which assumes short query

response time. Therefore, the huge data volumes stored in a typical data warehouse and

the complexity of the queries, with their intrinsic ad hoc nature, make the performance of

query execution the central problem of large data warehouses. The main goal of this

work is to investigate ways to allow a dramatic reduction of the hardware, software, and

administration cost when compared to traditional data warehouses. The affordable data

warehouses solution will be built upon the high scalability and high performance of the

DWS (Data Warehouse Striping) technology. Starting from the classic method of

uniform partitioning at low level (facts), DWS includes a new technique that distributes

a data warehouse over an arbitrary number of computers. Queries are executed in

parallel by all the computers, guaranteeing a nearly optimal speedup.
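
As a rough illustration of the partitioning idea described above, the following Python sketch distributes fact rows round-robin over a number of nodes and answers a SUM query by merging per-node partial results. The function names (stripe_facts, partial_sum, parallel_sum) and the toy data are assumptions made for illustration only; they are not taken from the DWS implementation, and a thread pool stands in for separate computers.

```python
from concurrent.futures import ThreadPoolExecutor

def stripe_facts(fact_rows, num_nodes):
    """Round-robin (uniform) partitioning of fact rows over num_nodes nodes."""
    nodes = [[] for _ in range(num_nodes)]
    for i, row in enumerate(fact_rows):
        nodes[i % num_nodes].append(row)
    return nodes

def partial_sum(rows, column):
    """Partial aggregate computed locally by one node."""
    return sum(row[column] for row in rows)

def parallel_sum(nodes, column):
    """Run the same aggregate on every node and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda rows: partial_sum(rows, column), nodes))
    return sum(partials)

# Hypothetical toy fact table: ten sales with amounts 10, 20, ..., 100.
facts = [{"sale_id": i, "amount": 10 * i} for i in range(1, 11)]
nodes = stripe_facts(facts, 3)
total = parallel_sum(nodes, "amount")
```

Because every node holds roughly the same number of fact rows, each one does a similar share of the work, which is the intuition behind the near-optimal speedup mentioned above.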

This work will focus on various aspects related to:

• Automatic data balancing: As each node in the cluster may have different processing

capabilities, it is important to provide load balancing algorithms that automatically

provide the best data distribution. With these mechanisms the system will be able to

reorganize the data whenever it is needed, in order to make the load in each node as

balanced as possible, allowing similar response times for every node.

• Auto administration and tuning: By using a cluster of machines the administration

complexity and costs tend to increase dramatically. Although the nodes normally have

similar configurations, some discrepancies are expected due to the

heterogeneous nature of the cluster. To achieve the best configuration we need to tune

each node individually. Thus, we have to develop a solution for automatic

administration and tuning in distributed data warehouses that allows a reduction of the

administration cost and an efficient use of the system resources.
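
One simple way to picture the automatic data balancing item above is to give each node a share of the rows proportional to its processing capability. The sketch below is only an assumption about what such a policy could look like: the capacity figures are hypothetical, and the largest-remainder rounding is a choice made here so that the integer shares add up exactly.

```python
def balanced_allocation(total_rows, node_capacities):
    """Split total_rows across nodes in proportion to each node's capacity.

    Uses the largest-remainder method so the shares are integers that
    sum exactly to total_rows.
    """
    total_capacity = sum(node_capacities)
    exact = [total_rows * c / total_capacity for c in node_capacities]
    shares = [int(x) for x in exact]
    leftover = total_rows - sum(shares)
    # Hand the remaining rows to the nodes with the largest fractional parts.
    order = sorted(range(len(exact)), key=lambda i: exact[i] - shares[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares

# Hypothetical cluster: the middle node is twice as fast as the other two.
shares = balanced_allocation(1000, [1.0, 2.0, 1.0])
```

A real balancing mechanism would also have to measure node capabilities and decide when moving data is worth the cost; the sketch covers only the allocation step.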


The PhD work will comprise the following initial tasks: (a) overview of the state of the

art; (b) characterization of the requirements of affordable data warehouses; (c)

infrastructure for improved performance in affordable data warehouses; (d)

implementation and evaluation; (e) writing papers.

PhD Thesis Proposal: G7.12

Title: “Towards High Dependability in Affordable Data Warehouses”

Keywords: Data warehousing, on-line analytical processing (OLAP), fault-tolerance,

data security, parallel and distributed databases

Supervisors: Prof. Marco Vieira, Prof. Henrique Madeira


In recent years, there has been an explosive growth in the use of databases for decision

support. These systems, generically called Data Warehouses, involve manipulations of

massive amounts of data that push database management technology to the limit,

especially with regard to performance and dependability. The huge data volumes

stored in a typical data warehouse make performance and availability two central

problems of large data warehouses. An affordable data warehouses solution is being built

upon the high scalability and high performance of the DWS (Data Warehouse Striping)

technology. Starting from the classic method of uniform partitioning at low level (facts),

DWS includes a new technique that distributes a data warehouse over an arbitrary

number of computers. The fact that the data warehouse is distributed over a large

number of computers raises new challenges as the probability of failure of one or more

computers greatly increases. The main goal of this work is to investigate ways to

achieve high-dependability in the affordable data warehouses solution while allowing a

dramatic reduction of the hardware, software, and administration cost when compared

to traditional data warehouses. Thus, this work will focus on various aspects related to:

– Data security: the affordable data warehouses solution will be based on open-source

database management systems (DBMS). However, these databases do not provide the

security mechanisms normally available in commercial DBMS. In addition, the

distributed database approach increases the data security requirements. The goal is to

investigate the security needs for distributed data warehouses over open source DBMS

and to propose advanced mechanisms that improve the overall system security.

– Data replication and recovery: compared to a single-server database, one of the

consequences of the use of a cluster of affordable machines is the increase of the

probability of failure. Thus, one of the goals is to research a new technique for data

replication and recovery that allows the system to continue working in the presence of

failures in several nodes and facilitates the recovery of failed nodes.


The PhD work will comprise the following initial tasks: (a) overview of the state of the

art; (b) characterization of the dependability requirements of affordable data warehouses;

(c) infrastructure to support high dependability in distributed data warehouses; (d)

implementation and evaluation; (e) writing papers.

G8: Information Systems

PhD Thesis Proposal: G8.1

Title: “Learning Contexts and Social Networking”

Keywords: e-learning 2.0, e-learning, learning contexts, learning, LMS, social

networking, technology enhanced learning (TEL), web 2.0

Supervisor: Prof. Antonio Dias de Figueiredo


This thesis concentrates on the design, implementation and management of learning

contexts. The major tenet of our Learning Contexts Project, which has been gaining

strength in the course of over thirty years devoted to ICT and Education, is that the

future of learning is not to be found just on content, but also, and very much, on

context, that is, on making learning happen within activity rich, interaction rich, and

culturally rich social environments that never existed, that the intelligent use of

technology is making possible, and where different paradigms apply. Many of the most

dynamic fields of research in learning and education, such as computer supported

cooperative learning, situated learning, or learning communities relate to learning

contexts. Hundreds of expressions used in education – such as project based learning,

action learning, learning by doing, case studies, scenario building, simulations, role

playing – pertain to learning contexts. The advantage of concentrating on context, as a

whole, rather than on the multiplicity of its manifestations studied by disparate

research groups is that, by doing so, we can articulate that multitude of theories and

practices into a single, coherent, organic, and operational worldview. The proposed

thesis pushes forward our current efforts in this field by exploring the relationship

between Learning Contexts and Social Networking. This may include collaboration with

another of our projects, the Serendipity Project, centred on the development of a

serendipitous social search engine for which we hold a US patent application. In order

to stimulate the creativity of the candidates, plenty of leeway will be given to them, so

that they may choose to concentrate on theoretical aspects, on practical educational

issues, or on the specification and design of the ideal, and as yet nonexistent, learning

context management system (LXMS).

Prospective candidates wishing to clarify the research implications of learning contexts

may download from the journal Interactive Educational Multimedia our paper “Learning

Contexts: a Blueprint for Research”. Further information can be obtained by

downloading Chapter 1, “Context and Learning: a Philosophical Framework”, of our

book Managing Learning in Virtual Settings: the Role of Context, published by

Information Science Publishing (Idea Group). Successful candidates will have, or be

willing to develop throughout their PhD, a mixed profile of educational technologist and

educational and social researcher.

- Figueiredo, A. D. (2005) “Learning Contexts: A Blueprint for Research”, Interactive Educational

Multimedia, No. 11, October 2005

- Figueiredo, A. D. and Afonso, A. P. (2005) “Context and Learning: a Philosophical Framework”,

in Figueiredo, A. D. and A. P. Afonso, Managing Learning in Virtual Settings: The Role of Context,

Information Science Publishing (Idea Group), October 2005.

PhD Thesis Proposal: G8.2

Title: “Quality management and information systems: getting more than the sum of

the parts”

Keywords: Quality Management, Information Systems

Supervisor: Prof. Paulo Rupino


Quality Management of products, services and business processes is, today, a key issue

for the success of most companies operating in global contexts. In fact, holding a

Quality certification, such as that established by the ISO 9001:2000 standard, is becoming a

basic requirement for companies to play in several international markets. On the other

hand, the design and deployment of information systems is another key aspect to

consider when modern organizations define their business models and strategy.

It is quite surprising, thus, that in spite of the fact that both quality management and

information systems architecting require intensive strategic analysis and the extensive

involvement of staff and managers in the examination and redesign of business

processes, the two endeavors are still treated as completely distinct. They are usually

conducted as separate projects, handled by different teams, equipped with unconnected


The integrated design of these two pillars of modern organizations, in a manner that

they depend on, support, and reinforce each other, enables a quantum leap, as it lets

organizational tasks be reengineered in the light of: (i) effectiveness, consistency and

evidence of compliance, as required by quality systems; and (ii) efficiency, harnessing

the power of digital information storage, processing, and communication in the renewed

business processes. Typical criticisms of traditional implementations of Quality

Management Systems can also be alleviated, namely by reducing the added

bureaucracy and overhead imposed on users by traditional implementations. The

economic impact on organizations can be considerable, not only at the initial planning

stage but, more importantly, throughout the lifecycle of operation of this unified system.

The likelihood of synergy between quality management and IT infrastructure has been

suggested by a few authors, but no systematic processes for leveraging those synergies

can be found.

A successful Ph.D. in this unexplored field will arm its holder with the skills and tools

to act in an increasingly appealing consulting arena.

PhD Thesis Proposal: G8.3

Title: “Visualizing and Manipulating Work Load Control over Business Networks”

Keywords: Information Systems, Work Load Control Methodology, Human-Computer

Interaction, Information Visualization, Direct Manipulation Interfaces, Delegation,

Interface Agents

Supervisor: Prof. Licínio Roque


In a previous project we designed a web-based planning and control system for Small

and Medium Enterprises (SMEs) operating in Make-To-Order (MTO) clusters (a case in the Mould-making

Industry) that implemented an adaptation of the Work Load Control planning

methodology proposed at Lancaster. The Work Load Control methodology enables

production management across shopfloors by controlling pending work load levels

across workcentres, thus effectively managing overlapping “windows of opportunity” for

completing every task at each production unit. We have developed a system where the

user sets the planning conditions and delegates to the system the generation of plan

proposals. Unmet restrictions are then iteratively resolved by taking management

decisions that adapt a candidate plan to actual production conditions and vice-versa.
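
To make the idea of controlling pending work load levels more concrete, the sketch below shows one common reading of workload-controlled order release: a job waiting in a pool is released only if it keeps every work centre on its routing at or below a load norm. This is not the system developed in the project; the function name, the data layout, and the figures are all hypothetical.

```python
def release_jobs(pool, loads, norms):
    """Release jobs from the pre-shop pool while respecting workload norms.

    pool  : list of (job_id, {workcentre: hours}) in priority order
    loads : dict workcentre -> hours already released and still pending
    norms : dict workcentre -> maximum pending hours allowed
    """
    released = []
    for job_id, routing in pool:
        fits = all(loads[wc] + hours <= norms[wc] for wc, hours in routing.items())
        if fits:
            # Releasing the job adds its hours to every work centre it visits.
            for wc, hours in routing.items():
                loads[wc] += hours
            released.append(job_id)
    return released

# Hypothetical mould-making shop with two work centres.
loads = {"milling": 6.0, "erosion": 2.0}
norms = {"milling": 10.0, "erosion": 8.0}
pool = [
    ("J1", {"milling": 3.0, "erosion": 4.0}),
    ("J2", {"milling": 2.0}),
    ("J3", {"erosion": 2.0}),
]
released = release_jobs(pool, loads, norms)
```

Here J2 stays in the pool because it would push milling above its norm, which corresponds to the kind of unmet restriction that the planner then resolves by a management decision.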

Some of the conclusions of the project relative to the adoption of the new planning tool

were: a) the Work Load Control methodology, while particularly flexible and adaptable

for SMEs running MTO operating models, poses a learning obstacle for people trained to

think in time-sliced models (like those depicted in Gantt diagrams); b) a web-based

system, while easy to deploy and manage, poses a heavy cognitive load as the principal

interaction mode is linguistic (using menus, dialogs, forms, with some graphics for

visualizing resulting plans); c) current business globalization increasingly involves high

levels of subcontract work that needs to be managed across enterprise networks with

only partial knowledge of production conditions, which makes it difficult to use

methodologies that assume full knowledge and control over production units.

With clients, we have come to the conclusion that a planning tool for the Work Load

Control methodology needs a visualization and direct manipulation tool to reduce the

cognitive overhead posed by the complexity and “non-intuitiveness” of the methodology

and enable the person to dynamically envision and track events across networks of

enterprises. This case provides an ideal opportunity to attempt an integration of

linguistic, direct manipulation and delegation modes of interface, develop novel

visualizations and test usability evaluation techniques. The research implies acquiring a

knowledge of the methodology, conceiving and studying appropriate solutions for the

case study by designing innovative interaction techniques. Relevance is met in Decision

Support Systems, Human Computer Interaction and, more generally, in the Information

Systems academic and business communities.

PhD Thesis Proposal: G8.4

Title: “Designing Games as Learning Contexts”

Keywords: Information Systems, Human-Computer Interaction, Context

Engineering, Learning Games, Learning Contexts, Social Constructivism

Supervisor: Prof. Licínio Roque


Several authors have argued the idea that videogames could be exploited as learning

environments. James Paul Gee has written extensively on the subject of learning from

computer games, noting that we can hardly ignore the learning that takes place with

this new medium. Marc Prensky argued for this idea based on the simulation aspects of games.

Raph Koster, a designer and consultant, adopts the perspective that games are actually

meant to be learned, and makes playing as learning the basis for game design. Simon

Egenfeldt-Nielsen produced a PhD thesis on the educational potential of computer

games, by analyzing cases that used commercially available games.

Seeing games in the light of socio-technical theories such as Actor-Network Theory, we have

come to an interpretation of games as constructions designed and built to enforce

specific “programs of action” while their storytelling and underlying rules provide the

basis for stable translation regimes by the player. Games as simulations can also be

understood as embodied theories of physical or social reality. By providing specific

embodied concepts and representations, physical or other abstract rules, characters

with recognizable behavior, the designer builds a learning context with conditions of

engagement that concur to enable playful experiences. These can be more or less

flexible or open to interpretation through player choices, but always enforcing

underlying worldviews or inscribed theories that are meant to be learned in the course

of playing the game if the player is to achieve the game’s goal. It is this activity

conditioning that we suppose to be the usable basis for explicitly conceiving games as

learning devices.

An alternative approach would be to take game design itself as the learning activity and

explore the learning potential inherent in the design activity. Either way, the research

should focus on the methodological problem of explicitly modeling and building games

as learning contexts. The Context Engineering approach can be used to frame

development of specific contexts, by prototyping on available multiplayer game

development technology and focusing on design aspects and their relation to the

proposed problem. Adequate evaluation techniques should also be a consideration in

the studied contexts if they are to be socially accepted as effective learning alternatives.

Relevance is expected for the Learning Sciences, Human and Social Sciences,

Information Systems and Human Computer Interaction, Game Studies, Media Studies,

and society at large.

PhD Thesis Proposal: G8.5

Title: “Visual Modeling Language for Game Design”

Keywords: Information Systems, Human-Computer Interaction, Visual Modeling

Language, Games and Design

Supervisor: Prof. Licínio Roque


Game design is a complex activity that deals with multiple and heterogeneous

concurrent constraints. A game designer has to consider combined effects of multiple

elements effectively requiring a trans-disciplinary background that can range, with

variable intensity, from Humanities to Media Studies, to Psychology and Sociology, to

Aesthetics, to Informatics and Economics. While the advent of a common design

language still seems far in the future, some design patterns can be readily recognized

from the 30+ year history of videogame development. An example of a systematization is

Björk and Holopainen’s “Patterns in Game Design”. Nonetheless, the idea of a visual

language for game modeling and design seems not only possible and relevant, but a

pressing research goal.

In pursuing the goal of a visual modeling language for game design it is expected that a

knowledge of socio-technical studies of science and technology will give useful insights

into the problems of trans-disciplinarity and in particular the use of Actor-Network

Theory constructs as a basis language for the analysis of game contexts. A requisite for

such a language would be that we could build a prototype modeling tool that could be

used to sketch and generate game coding to be used on a current software game

platform. Another basic requisite would be that it could serve as a basis for design

dialogue between people with diverse disciplinary backgrounds.

This research project would involve the trans-disciplinary background study and

drafting of a language prototype that could then be evaluated and evolved by

successive design iterations, against updated usability requirements. Relevance and

adequacy would be judged based on actual empirical design experience and historical

design accounts, in the fields of Information Systems, Human Computer Interaction,

Game Development, Design, Games Studies and Media Studies.

PhD Thesis Proposal: G8.6

Title: “Programming Games by Demonstration (and Learning to Program)”

Keywords: Information Systems, Human-Computer Interaction, Games and Design,

Programming by Demonstration, Programming by Example

Supervisor: Prof. Licínio Roque


Game programming and development, from scratch, requires advanced skills and

specialties often unavailable to a game designer or an artist, let alone to the proverbial

man-on-the-street. The advent of general purpose game engines and game generation

environments made it simpler and more affordable, or at least a less specialized task, to

be able to develop and deploy complex game scenarios. Yet even the simplest game design

attempt still requires, if not a mastery of complex programming, at least some skill

with a scripting language.

Towards democratizing this technology and the videogame medium, and taking the

historical lesson from what happened with the appearance of personal filming cameras

and the development of the cinema, again with video and the TV, it seems interesting to

work on a solution that would enable a wider public to become an active participant in

the creation of interactive content. Resorting to and evolving programming by

demonstration or programming by example techniques could play a significant part in

lowering the learning barrier to achieve useful effects analogous to behavior scripting.

Building a system that would enable reflexive action by letting the user inspect and

animate the programming results of demonstrative actions, could serve as a basis for

semi-autonomous learning of programming concepts and skills.

It is intended that the candidate researcher pursue this goal by designing and building a

prototype system on top of an existing software game platform or virtual environment

and proceed to evaluate the generated concept through actual empirical cases with

targeted user segments. Relevance and adequacy would be judged based on actual

empirical design experience, with results published in the fields of Information Systems,

Human Computer Interaction, Game Development, Design, Games Studies and Media


G9: Evolutionary and Complex Systems

PhD Thesis Proposal: G9.1

Title: “Evolving Representations for Evolutionary Algorithms”

Keywords: Evolutionary Computation, Gene Regulatory Networks, Self-Organization,

Supervisor: Prof. Ernesto Costa


Evolutionary Algorithms typically approach the genotype - phenotype relationship in a

simple way. As a matter of fact, conventional EAs consider the genotype as the complex

structure and rely on more or less simple mechanisms to do the mapping from genotype

to the phenotype. Some work is being done on the relation between the user-designed

representations used by the EA (the genotype) and the fitness landscape induced by the

problem (the phenotype). The idea is to understand the role played by representations

for improving evolvability. This is important but we can advance a step further.

Exploring ideas from developmental biology in the context of evolutionary algorithms is

not new. Notwithstanding, the challenge here is to understand better how we can

combine the theory of evolution with embryonic development in a unified framework

and explore it computationally aiming at evolving the representations to be explored by

an EA instead of designing them offline, attaining a self-organized evolutionary algorithm

(SOEA). To achieve that goal we have to identify the building blocks for representations

as well as the transformational rules that end up in the definition of an adapted


PhD Thesis Proposal: G9.2

Title: “Harnessing Dynamic Environments: the problem of prediction”

Keywords: Evolutionary Computation, Dynamic Environments, Prediction

Supervisors: Prof. Ernesto Costa, Prof. Juergen Branke


Most real-world optimization problems dynamically change over time (e.g. scheduling,

routing, transportation or robot navigation problems). In such cases, the task of an

optimization algorithm changes from finding an optimal solution to continuously adapting

an existing solution to the changing environment. Nature-inspired optimization

algorithms have proven to be successful candidates for such problems.

When changes occur repeatedly it should be possible to learn from the experience and

predict future changes of the environment. Such predictions, even if uncertain, could

help the algorithm to make decisions that prepare it for what is to come, allowing an

even faster adaptation and avoiding getting stuck in "dead ends". In the context of dynamic

environments most work has focused on enabling the algorithm to adapt quickly after a

change. Only very few papers have attempted to anticipate changes. So far, there is no

fundamental and general investigation on the importance and possibilities to integrate

prediction into nature-inspired optimization. This thesis would aim at making some

fundamental investigations into the role and potential of prediction in dynamic

environments, and at developing some new and effective ways to integrate prediction

into various nature-inspired optimization heuristics.
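
As a minimal sketch of what integrating prediction could mean in practice, the Python fragment below extrapolates the next position of a moving optimum from the last two observed optima and seeds part of the population around that forecast. Linear extrapolation, the function names, and the parameter values are all assumptions chosen for brevity; the thesis itself is expected to investigate far more general predictors.

```python
import random

def predict_next_optimum(history):
    """Linear extrapolation from the last two observed optima."""
    if len(history) < 2:
        return list(history[-1])
    prev, last = history[-2], history[-1]
    return [2 * b - a for a, b in zip(prev, last)]

def seed_population(population, history, fraction=0.25, spread=0.1, rng=random):
    """Replace a fraction of the population with points near the prediction."""
    predicted = predict_next_optimum(history)
    count = int(len(population) * fraction)
    seeded = [[x + rng.gauss(0.0, spread) for x in predicted] for _ in range(count)]
    # The rest of the population is kept to preserve diversity.
    return seeded + population[count:]

# Hypothetical history: the optimum moved from (0, 0) to (1, 0.5).
history = [[0.0, 0.0], [1.0, 0.5]]
predicted = predict_next_optimum(history)
```

Seeding only a fraction of the population reflects the point made above that predictions are uncertain: the remaining individuals still cover the case where the forecast is wrong.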

PhD Thesis Proposal: G9.3

Title: “Evolutionary Hybridization with State-of-the-art Exact Methods”

Keywords: Evolutionary computation, optimization, hybridization.

Supervisor: Prof. Francisco Pereira


Standard Evolutionary Algorithms (EA) often perform poorly when searching for good

solutions for complex optimization problems and may benefit if they are combined with

other techniques. Broadly speaking we can consider two large classes of hybrid

architectures: the EA can be complemented with other search methods or it can be

enhanced with problem specific heuristics that add explicit knowledge about the

problem being solved. A drawback associated with research conducted on this topic is

that many reported approaches are typically somewhat naive in nature and have

limited applicability. Moreover, EAs are, in most situations, combined with basic

standard procedures such as hill-climbing algorithms or simulated annealing.

This project aims at conducting an inclusive study of evolutionary hybridization to

analyze if it is possible to develop new architectures that perform better than today’s

methods. Special attention will be given to hybridization with exact algorithms, like

linear programming or gradient-based search. The challenge is to understand how the

key properties of these exact techniques, such as the capability to reduce the search

space or the effective exploration of neighborhoods, might be used by the EA to

efficiently perform a global exploration of the search space. Several examples of

optimization problems will be used to perform a comprehensive analysis of the

developed hybrid architectures.
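
The general shape of such a hybrid can be sketched as an EA whose offspring are refined by a few steps of gradient-based search before selection. The example below is a toy under stated assumptions: the shifted sphere objective, the parameter values, and the (mu + lambda) selection scheme are chosen here for illustration and are not prescribed by the proposal.

```python
import random

TARGET = 3.0  # hypothetical location of the optimum in every dimension

def fitness(x):
    """Sphere function shifted to TARGET; lower is better."""
    return sum((xi - TARGET) ** 2 for xi in x)

def gradient(x):
    """Analytical gradient of the shifted sphere function."""
    return [2.0 * (xi - TARGET) for xi in x]

def local_search(x, steps=5, rate=0.1):
    """A few gradient-descent steps used to refine one individual."""
    for _ in range(steps):
        x = [xi - rate * gi for xi, gi in zip(x, gradient(x))]
    return x

def hybrid_ea(dim=2, pop_size=10, generations=30, sigma=0.3, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for parent in population:
            # Gaussian mutation provides the global exploration.
            child = [xi + rng.gauss(0.0, sigma) for xi in parent]
            # Gradient-based refinement provides the local exploitation.
            offspring.append(local_search(child))
        # (mu + lambda) selection keeps the best pop_size individuals.
        population = sorted(population + offspring, key=fitness)[:pop_size]
    return population[0]

best = hybrid_ea()
```

On a smooth convex function like this one the local search does most of the work; the research question raised above is how to combine the two components on problems where neither succeeds alone.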

